What do you think is/are the best DB(s) for my use case? Do you have any other ideas like Cloud Firestore or MongoDB?
Here is the project: https://1drv.ms/w/s!Aryf65kIpgMPp_VQzKCJWCUzU8esTA

I'm wondering if any Cloud Firestore users might be open to sharing some input on the challenges they encountered when trying to create a low-cost, low-latency data pipeline to their analytics warehouse (e.g. Google BigQuery, Snowflake, etc.).
I'm working with a platform by the name of Estuary.dev, an ETL/ELT tool, and we are conducting some research on the pain points here to see if there are drawbacks to the Firestore->BQ extension and/or if users are seeking easy ways of getting from NoSQL to fine-grained tabular data.
Please feel free to drop some knowledge/wish list stuff on me for a better pipeline here!
Good day, everyone!
We recently got the request to add a feature to our systems: Forms should be saved to be continued later given the following cases:
If the connection to the server is lost, if the user closes their browser, if the user changes networks, if the user didn't finish the form and decides to take a week off and a mix of them all.
One of the best solutions we've thought of so far is to save the form data to a remote database every few minutes, directly from the client.
A document-oriented database seems the best bet, but which database to use is the question. Since we already use Google Cloud Platform, Cloud Firestore is a "safe" option right now, but we are looking at every option we can find. The "caveat" of our use case is that we need more writes than reads, but writes are usually more expensive, and our biggest constraint is budget.
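A minimal sketch of the periodic-save idea with the write budget in mind, assuming a hypothetical `write` callback that wraps whatever database client ends up being used. The class name, the 120-second interval, and the injected clock are illustrative assumptions, not part of any SDK. It only issues a write when the draft has changed and enough time has passed since the last one:

```python
import time
from typing import Callable, Optional


class DraftAutosaver:
    """Issues a write only if the draft changed and the minimum interval has elapsed."""

    def __init__(
        self,
        write: Callable[[dict], None],   # hypothetical: wraps the real database write
        min_interval_s: float = 120.0,   # assumed interval; tune it against the budget
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._write = write
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_saved: Optional[dict] = None
        self._last_write_at: Optional[float] = None

    def maybe_save(self, draft: dict) -> bool:
        """Return True if a (billable) write was issued."""
        if draft == self._last_saved:
            return False  # nothing changed, so skip the write
        now = self._clock()
        if (
            self._last_write_at is not None
            and now - self._last_write_at < self._min_interval_s
        ):
            return False  # changed, but too soon since the last write
        snapshot = dict(draft)  # shallow copy; assumes flat form fields
        self._write(snapshot)
        self._last_saved = snapshot
        self._last_write_at = now
        return True
```

Calling `maybe_save` on every field change keeps the number of writes bounded by the interval rather than by how fast the user types.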
That's why I am seeking advice. What other options do we have?
In case you're wondering, we seek to save the forms bypassing the backend and our own system resources entirely, so a dedicated remote database or Local Storage are our best options. Local Storage is a somewhat controversial choice within the team, though, so we will explore that option later on.
Any advice or experience in this matter is highly appreciated!
You can give ClientDB a try. It simplifies storing data locally, and it takes care of syncing it regularly to your remote data storage.
Advantages:
Downside:
Thanks for the feedback. Considered this system for saving data
I don’t have time to go into a deeper analysis, but consider that in the Cloud Firestore projects I’ve had in the past with a heavy write load, my cost was not in the writes: it was in the storage. So factor that into your cost analysis: write cost over time plus storage cost over time. Storage costs keep accumulating and never go down unless you manually clean up your database. With Cloud Firestore this was harder than with MongoDB Atlas, which has TTL for documents. By the way, I’m basing this on an experience with Firestore from about two years ago, so you might want to check what the situation is nowadays when it comes to archiving or deleting old data and storage costs. And last but not least, consider both Firestore and Datastore mode in your analysis. I thought there was no advantage for my use case, but Datastore mode was actually more suitable at the time for a huge number of writes and deletes.
Also be mindful of performance during writes. I’ve had a bunch of performance issues with Firestore due to heavy write load and had to batch writes. I would personally choose MongoDB Atlas, as I trust MongoDB a tad more as a product and as a company, and I always have the option to spin up my own Mongo if costs get too high on a cloud platform.
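To make the "writes over time plus storage over time" comparison concrete, here is a rough sketch. It assumes every write creates a new document of a fixed average size and ignores read and delete charges. The function name and both price parameters are placeholders to be filled in from the provider's current price list, not real prices:

```python
from typing import Optional


def cumulative_cost(
    months: int,
    writes_per_month: int,
    avg_doc_bytes: int,
    price_per_100k_writes: float,            # placeholder: take from the price list
    price_per_gib_month: float,              # placeholder: take from the price list
    retention_months: Optional[int] = None,  # None means nothing is ever deleted
) -> float:
    """Total bill after `months`, summing write charges and monthly storage charges."""
    total = 0.0
    for month in range(1, months + 1):
        # Months of data still stored at the end of this month.
        kept = month if retention_months is None else min(month, retention_months)
        stored_gib = kept * writes_per_month * avg_doc_bytes / 2**30
        total += writes_per_month / 100_000 * price_per_100k_writes
        total += stored_gib * price_per_gib_month
    return total
```

Under these assumptions the write charge is the same every month, while the storage charge grows each month unless `retention_months` caps it, which is the effect described above.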
Thanks! I truly didn't think about the storage cost. We do plan to delete records after a week or so with no activity and whenever the form is completed, likely during non-working hours. That being said, I will take the storage cost into consideration nevertheless. It is something that will add up, whether we clean the DB or not.
Storage cost was a huge deal for me there - but I was at a scale of 8MM events per day, so it truly depends on the size of your data - but for data of this size that can be deleted, TTL for documents is a must unless you wanna write your own batch jobs to delete data 💀
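For reference, the "write your own batch job" route boils down to something like the sketch below. It runs against an in-memory dict standing in for the collection; the `updated_at` and `completed` field names and the seven-day window are assumptions taken from the cleanup plan in this thread, and a real job would page through the database and delete in batches:

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def expired_ids(docs: Dict[str, dict], now: datetime, max_idle: timedelta) -> List[str]:
    """IDs of drafts that are completed or idle for longer than max_idle."""
    return sorted(
        doc_id
        for doc_id, doc in docs.items()
        if doc["completed"] or now - doc["updated_at"] > max_idle
    )


def purge(docs: Dict[str, dict], now: datetime,
          max_idle: timedelta = timedelta(days=7)) -> int:
    """Delete expired drafts in place and return how many were removed."""
    stale = expired_ids(docs, now, max_idle)
    for doc_id in stale:
        del docs[doc_id]
    return len(stale)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    drafts = {
        "old": {"updated_at": now - timedelta(days=10), "completed": False},
        "new": {"updated_at": now, "completed": False},
    }
    print(purge(drafts, now), "draft(s) removed")
```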
I have a mid-size React Native project. I work with SQL databases, but I am thinking about moving to NoSQL. I tried to learn NoSQL databases like Cloud Firestore, but I ran into issues building relations between models, and I also noticed that the data is duplicated in many places. Should I stick with SQL databases like PostgreSQL or try again with Firebase?
Moving from a SQL database to a noSQL database is a really big deal, congrats 👏! In the current industry having handy knowledge of both is really important and I would recommend continuing to learn Firebase even if it feels a little unnatural.
When using noSQL databases you have to consider the idea that the models you work with are individual documents. The data in the documents can be duplicated, re-referenced, and completely dissimilar to each other; the only thing differentiating two documents is an ID value. It's normal for data to be duplicated if you don't perform operations on the same document you already created. You stop unnecessary duplication by wrapping your document creation in a function that checks for a certain value (say, if you don't want a username value to appear twice, you would check whether your collection of documents already has one with that username value and then throw an error).
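A minimal sketch of such a wrapper, using a plain dict as a stand-in for the collection (the `create_user` and `UsernameTaken` names are made up for illustration). Against a real database the check and the write would need to happen atomically, for example inside a transaction, or two clients could both pass the check at the same time:

```python
from typing import Dict


class UsernameTaken(Exception):
    """Raised when a document with the same username already exists."""


def create_user(users: Dict[str, dict], user_id: str, username: str) -> dict:
    """Create a user document unless the username is already in the collection."""
    if any(doc["username"] == username for doc in users.values()):
        raise UsernameTaken(username)
    doc = {"username": username}
    users[user_id] = doc
    return doc
```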
Creating relations in a noSQL database would be done by referencing a different document in another collection (typically with the aforementioned ID value) and then going and finding that document in that collection. Say you want to find a user's posts. In the User document you would have an array of references to different documents in a Post collection, and then you would go to the Post collection and find each document that has a matching ID number to the ones stored in the User's references.
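The user-to-posts lookup described above can be sketched with two dicts standing in for the User and Post collections. The `post_ids` field name is an assumption for illustration; with a real database each ID would be a document fetch:

```python
from typing import Dict, List


def posts_for_user(
    users: Dict[str, dict], posts: Dict[str, dict], user_id: str
) -> List[dict]:
    """Follow the ID references stored on the user document into the Post collection."""
    post_ids = users[user_id]["post_ids"]
    # Skip references that point at posts which no longer exist.
    return [posts[post_id] for post_id in post_ids if post_id in posts]


if __name__ == "__main__":
    users = {"u1": {"name": "Ada", "post_ids": ["p1", "p3"]}}
    posts = {"p1": {"title": "Hello"}, "p2": {"title": "Other"}, "p3": {"title": "Again"}}
    print(posts_for_user(users, posts, "u1"))
```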
All in all, you have to realize you are working with two completely different ways of storing and relating data, similar to learning a new and completely different language. If you stick to it and don't try to force SQL onto noSQL, then you will get the hang of it and it will be very rewarding.
I am considering choosing Cloud Firestore as the primary database for my app, which needs huge storage. I'm still a bit unsure about my choice because, as far as I know, it's not used by big companies as a database and it's not very scalable. But with the free tier and pay-as-you-go pricing, it seems like a good budget deal. Please recommend some other databases suited to my situation: low cost, highly scalable, easy to manage, good storage.
We are building a social media app where users will post images, like posts, and make friends based on their interests. We are currently using Cloud Firestore and Firebase Realtime Database. We are looking at another database such as Amazon DynamoDB; how efficient would this decision be in terms of pricing and overhead?
Hi, Akash,
I wouldn't make this decision without lots more information. Cloud Firestore has a much richer metamodel (document-oriented) than Dynamo (key-value), and Dynamo seems to be particularly restrictive. That is why it is so fast. Most applications have many needs for lightning-fast access to the members of a set, one set at a time. For that, DynamoDB is a great choice. But social media applications generally need to be able to make long traverses across a graph. While you can make almost any metamodel act like another one, with your own custom layers on top of it or just by writing a lot more code, it's a long way around to do that with simple key-value sets. It's hard enough to traverse across networks of collections in a document-oriented database. So, if you are moving, I think a graph-oriented database like Amazon Neptune or, if you might want built-in reasoning, Allegro or Ontotext would take the least programming, which is where the most cost and bugs can be avoided. Managed systems are also less costly in terms of people's time and system errors. It's easier to measure the costs of managed systems, so they are often seen as more costly.
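To illustrate why graph traversals are the long way around on a key-value store, here is a small sketch that finds everyone within a given number of hops and counts how many key lookups that takes. The `friends` dict stands in for a table keyed by user ID, and the function name is made up for this example:

```python
from collections import deque
from typing import Dict, Set, Tuple


def within_hops(
    friends: Dict[str, Set[str]], start: str, max_hops: int
) -> Tuple[Set[str], int]:
    """Breadth-first search; returns (users reachable within max_hops, key lookups)."""
    seen = {start}
    queue = deque([(start, 0)])
    lookups = 0
    while queue:
        user, depth = queue.popleft()
        if depth == max_hops:
            continue  # far enough; do not expand this user
        lookups += 1  # one key-value read per expanded user
        for friend in friends.get(user, set()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    seen.discard(start)
    return seen, lookups
```

Every extra hop multiplies the number of reads by roughly the average friend count, and all of that traversal logic lives in application code rather than in the database.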
I am planning a project in Cloud Firestore that will store about 100GB of data. Should I go for Cloud Firestore or traditional AWS RDS MS-SQL SERVER with AWS Lambda? I would appreciate your suggestions.
I'm researching what technology stack I should use to build my product (something like a food delivery app) for web, iOS, and Android. Please advise which technologies you would recommend from a scalability, reliability, cost, and efficiency standpoint for a start-up. Here are the technologies I came up with; feel free to suggest any new technology even if it's not in the list below.
For Mobile Apps -
For UI -
For Back-End or APIs -
For Database -
Thanks!
My Recommendations:
Front End: Flutter, because of developer tooling and a powerful declarative widget system
Back End: Node.js or Go, because Node.js has a large ecosystem and Go has a good built-in HTTP setup
Database: Cloud Firestore, because of ease of use, NoSQL, and the ability to set data from the client
Thanks. Since Google Cloud Firestore is a NoSQL database, I'm wondering how it works for an app that does daily transactions in a user checkout flow, etc.?
I'm not entirely sure what the question is about, as I don't see any problem using Cloud Firestore for transactions, but here is a use case for using Firestore with Stripe: https://firebase.google.com/docs/use-cases/payments
If you go with React / React Native, I advise you to go with Node. Why? At first I didn't believe that coding in JavaScript everywhere (back end, front end, and DB queries) would make life SO much easier. I still followed the advice, and in the end it was a huge relief. For a small startup project with 1/2/3 devs, using only one language increases efficiency a lot. You can switch very fast from one topic to another.
I want to develop a mobile app with Cloud Firestore as the backend. It was fine until I realized I need to implement full-text search, and Firestore doesn't support it natively. They advise using Algolia, but at this time I'm not willing to pay for it.
Therefore, I'm thinking about using a built-in tool from Google Cloud. Since this app is online-only, offline support and sync are not a top priority, so how about using Google Cloud SQL? Or do you recommend any other stack for me? Thanks for your advice.
In the early days, people would set up their own Elasticsearch clusters and have trouble maintaining them and keeping them secure. This also required a lot of manual work to get the data in, keep it current, query it, etc. A couple of years ago AWS came up with Amazon Elasticsearch Service, which reduced some of this overhead. My previous teams and I went through all of these different stages, and ultimately discovering Algolia a couple of years ago solved so many of our issues, kept the cost low, dramatically reduced implementation and therefore GTM timelines, and freed up so much of our engineering bandwidth. They have SDKs for most common languages and platforms, and you can achieve a complete solution in just a matter of hours, if not minutes. Value your own and your team's engineering time. Factor that in when comparing costs. There should also be entry-level tiers you can use to roll out a proof of concept.
Since SQL databases have built-in support for full-text search, they are a better choice. With Firestore you can implement a full-text search, but it will cost more reads than it otherwise would, and you'll need to enter and index the data in a particular way. In this approach you can use Firebase Cloud Functions to tokenise and then hash your input text, choosing an order-preserving (monotonic) hash function h(x) that satisfies the following: if x < y < z then h(x) < h(y) < h(z). For tokenisation you can choose a lightweight NLP library that strips unnecessary words from your sentence, in order to keep the cold start time of your function low. Then you can run a query with the less-than and greater-than operators in Firestore.
When storing your data, you'll also have to make sure that you hash the text before storing it, and store the plain text as well, because if you change the plain text the hashed value will also change.
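One possible reading of this approach, sketched in plain Python. The stop-word list, the eight-letter width, and the base-27 encoding are all assumptions chosen for illustration, and the in-memory `index` stands in for wherever the codes would be stored; how the range filter maps onto actual Firestore queries is not shown. The encoding is order-preserving (non-decreasing in alphabetical order), so all tokens starting with a given prefix fall into one contiguous numeric range:

```python
import re
from typing import Dict, List, Set, Tuple

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}  # tiny illustrative list
WIDTH = 8   # only the first 8 letters of a token are encoded (assumption)
BASE = 27   # digit 0 = padding, digits 1..26 = 'a'..'z'


def tokenize(text: str) -> List[str]:
    """Lowercase, keep runs of a-z, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def encode(token: str) -> int:
    """Order-preserving code: x <= y alphabetically implies encode(x) <= encode(y)."""
    code = 0
    for i in range(WIDTH):
        digit = ord(token[i]) - ord("a") + 1 if i < len(token) else 0
        code = code * BASE + digit
    return code


def prefix_range(prefix: str) -> Tuple[int, int]:
    """Inclusive code range covering every token that starts with prefix."""
    low = encode(prefix)
    high = low + BASE ** (WIDTH - min(len(prefix), WIDTH)) - 1
    return low, high


def search(index: Dict[str, List[int]], prefix: str) -> Set[str]:
    """IDs of documents with at least one token code inside the prefix range."""
    low, high = prefix_range(prefix.lower())
    return {
        doc_id
        for doc_id, codes in index.items()
        if any(low <= code <= high for code in codes)
    }
```

The index would be built at write time with `[encode(t) for t in tokenize(text)]`, which is the "hash the text before storing it" step; tokens that share their first eight letters get the same code, so this is prefix matching rather than true full-text search.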