Reading data from on prem data lake to cloud storage in order to utilize cloud computing for resource heavy operations regarding NLP and ML (<10GB Total). Trying to decide if we need to utilize Google BigQuery here or if we can work directly form Google Cloud Storage with a DataProc cluster. Any thoughts here would be appreciated in regards to which would be a better approach. Thanks!
4 upvotes·28.1K views
Replies (4)
BigQuery's cost is the same as cloud storage for the storage. The cost is during the query. If you have clean data and structure, store it directly in bigquery this will be way more easier. If you have messy data or if you need to enrich them dataproc is for you
2 upvotes·3.3K views
View all (4)