
Umair Iftikhar

Technical Architect at ERP Studio
umairumi7863381
Technical Architect at ERP Studio·
Needs advice on Amazon DynamoDB and MongoDB

We are developing a system that has to collect 10 million records every day. We need a NoSQL database solution. The data is simple logs. We are using AWS for now. I want to know which of the two available technologies is cheaper: Amazon S3 or MongoDB.

We have 30 tables collecting these logs.
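
For sizing and cost comparisons, it can help to turn the daily figure into an average write rate and a rough daily volume. The sketch below is only back-of-the-envelope arithmetic: the 10 million records per day comes from the question, while the 500-byte record size is a hypothetical placeholder that should be replaced with a measured value.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_writes_per_second(records_per_day):
    """Average write rate, assuming records arrive evenly over the day."""
    return records_per_day / SECONDS_PER_DAY


def daily_volume_gb(records_per_day, bytes_per_record):
    """Raw daily data volume in decimal gigabytes, before indexes or compression."""
    return records_per_day * bytes_per_record / 1e9


RECORDS_PER_DAY = 10_000_000      # from the question
ASSUMED_BYTES_PER_RECORD = 500    # hypothetical; measure your real log records

rate = average_writes_per_second(RECORDS_PER_DAY)
volume = daily_volume_gb(RECORDS_PER_DAY, ASSUMED_BYTES_PER_RECORD)
print(f"{rate:.1f} writes/s on average")
print(f"{volume:.1f} GB/day at the assumed record size")
```

Real traffic is rarely uniform, so peak rates may be well above this average.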

6 upvotes·26.1K views
Replies (4)
ivbeg
Founder - Dateno, Director - NGO "Informational Culture" / Ambassador - OKFN Armenia at Infoculture·

I am a big fan of MongoDB, and it's great for document storage, but I am not really sure that it's the best engine for log storage. If the data that you store is "flat" and well-defined, then log storage based on engines like ClickHouse or the Elasticsearch stack could be much more efficient. It also matters a lot how you reuse the collected logs. Do you calculate aggregated metrics? Do you need full-text search? And so on.

If the logs are really simple and full-text search is needed, then use Logstash + Elasticsearch. If you need to calculate a lot of metrics, and the logs are not just text but include numbers or values needed for aggregation, then use ClickHouse.
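
To make the distinction between the two access patterns concrete, here is a minimal plain-Python sketch. It does not use ClickHouse or Elasticsearch; the log records and field names (`service`, `latency_ms`, `message`) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical flat log records; field names are assumptions for illustration.
logs = [
    {"service": "api", "latency_ms": 120, "message": "request completed"},
    {"service": "api", "latency_ms": 80, "message": "request completed"},
    {"service": "worker", "latency_ms": 300, "message": "job failed: timeout"},
]


def average_latency_by_service(records):
    """Aggregation pattern: group numeric values and compute a metric per group."""
    totals = defaultdict(lambda: [0, 0])  # service -> [sum, count]
    for record in records:
        entry = totals[record["service"]]
        entry[0] += record["latency_ms"]
        entry[1] += 1
    return {service: total / count for service, (total, count) in totals.items()}


def search_messages(records, term):
    """Search pattern: find records whose free-text message contains a term."""
    term = term.lower()
    return [record for record in records if term in record["message"].lower()]


print(average_latency_by_service(logs))
print(search_messages(logs, "timeout"))
```

If most of your queries look like the first function, an engine built for aggregation is probably the better fit; if they look like the second, a search engine is.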

5 upvotes·24.9K views
subz390
Developer ·

The way I'd approach this is to carry out a survey. Prioritise a list of important criteria, such as performance, functionality, and cost. For example, with MongoDB you can archive documents if the data is not immediately required, which saves on costs at the expense of instant access; if that fits your use case, then you can use that feature. So create a test project that actually uses both services as per your use case, and see the results of the tests for yourself. Along the way you'll encounter issues peculiar to each platform that you can factor into your final decision, such as how easy it is to use their API, or whether the documentation is sparse or confusing. From there you'll be able to make an informed decision, and you'll be confident investing further resources into it.
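
One simple way to combine the prioritised criteria with the test results is a weighted score. The sketch below is only an illustration: the weights, the 1-to-5 scores, and the names `option_a` and `option_b` are all hypothetical and should be replaced with your own priorities and measurements.

```python
def weighted_score(weights, scores):
    """Weighted average of per-criterion scores; higher is better."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


# Hypothetical priorities: cost matters most in this example.
weights = {"performance": 3, "functionality": 2, "cost": 5}

# Hypothetical 1-5 scores filled in after running the test project.
candidates = {
    "option_a": {"performance": 4, "functionality": 5, "cost": 2},
    "option_b": {"performance": 3, "functionality": 3, "cost": 4},
}

ranking = sorted(
    candidates,
    key=lambda name: weighted_score(weights, candidates[name]),
    reverse=True,
)
for name in ranking:
    print(name, round(weighted_score(weights, candidates[name]), 2))
```

The numbers only summarise the survey; issues such as confusing documentation still need a judgment call when you assign the scores.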

5 upvotes·1 comment·26.2K views
reidmorrison · January 9th 2022 at 5:20PM

If you use Amazon DocumentDB instead of DynamoDB, it is compatible with the MongoDB API. That will keep your code cloud-agnostic, and you will have the option of switching between DocumentDB and MongoDB in the future, based on whichever ends up being cheapest to run.

umairumi7863381
Technical Architect at ERP Studio·
Recommends PostgreSQL

It is open source and has more tools than MySQL. PostgreSQL is an object-relational database management system (ORDBMS) with an emphasis on extensibility and standards compliance. It is also a good fit for small companies because its tools are freely available. PostgreSQL includes built-in support for regular B-tree and hash indexes. It also supports expression indexes and partial indexes (which index only part of a table). An expression index is built on the result of an expression or function instead of simply the value of a column.
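
A minimal sketch of the two index types follows. So that it runs without a database server, it uses Python's built-in `sqlite3` module as a stand-in rather than PostgreSQL itself; to my knowledge the two `CREATE INDEX` statements are also valid PostgreSQL syntax, but treat that as an assumption and check it against your version. The `users` table and its columns are hypothetical.

```python
import sqlite3

# SQLite is used here only as a runnable stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        active INTEGER NOT NULL DEFAULT 1
    );

    -- Expression index: indexes the result of lower(email), not the raw column.
    CREATE INDEX idx_users_lower_email ON users (lower(email));

    -- Partial index: only rows with active = 1 are included in the index.
    CREATE INDEX idx_users_active_email ON users (email) WHERE active = 1;
    """
)

index_names = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
}
print(sorted(index_names))
```

The expression index helps case-insensitive lookups such as `WHERE lower(email) = ...`, and the partial index stays small when most queries only touch active rows.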

4 upvotes·374.9K views
umairumi7863381
Technical Architect at ERP Studio·
Needs advice on Cassandra, Druid, and TimescaleDB

I am developing a solution that collects telemetry data from different devices: a minimum of about 1,000 devices and a maximum of 12,000. Each device sends 2 packets per second. This is time-series data; the data definitions and various reports, such as building information and maintenance records, are saved in PostgreSQL. I want to know the best solution. The data is needed for math and ML, to run different algorithms. The telemetry itself is raw, without the definitions and information stored in PostgreSQL. Initially, I went with TimescaleDB because of its PostgreSQL support, but as the number of sites increased, I started facing many issues with TimescaleDB in terms of flexibility in storing data.

My other major requirement is replication of the database for reporting and other purposes. You may also suggest options other than Druid and Cassandra, but an open-source solution is appreciated.
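
For anyone sizing an answer, the device counts and packet rate above translate into the following ingest rates. This is plain arithmetic on the figures in the question and assumes every device sends continuously.

```python
PACKETS_PER_DEVICE_PER_SECOND = 2   # from the question
SECONDS_PER_DAY = 24 * 60 * 60


def packets_per_second(devices):
    """Total ingest rate if every device sends continuously."""
    return devices * PACKETS_PER_DEVICE_PER_SECOND


def packets_per_day(devices):
    """Total packets written per day at that rate."""
    return packets_per_second(devices) * SECONDS_PER_DAY


for devices in (1_000, 12_000):     # minimum and maximum from the question
    print(
        f"{devices:>6} devices: {packets_per_second(devices):>6,} packets/s, "
        f"{packets_per_day(devices):,} packets/day"
    )
```

So the store has to sustain roughly 2,000 to 24,000 writes per second, or about 173 million to 2.07 billion rows per day.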

3 upvotes·452.7K views
Replies (1)
Recommends MongoDB

Hi Umair, did you try MongoDB? We are using MongoDB in a production environment and collecting data from devices, much like your scenario. We have a MongoDB cluster with three replicas. Data from the devices is written to the primary node, and the real-time dashboard UI uses the secondary nodes for read operations. With this setup, write operations are not affected by read operations.
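
As a rough sketch of how that split can be expressed, the snippet below builds two MongoDB connection strings with the standard `replicaSet` and `readPreference` options: one for the ingest service and one for the dashboard. The host names and the replica set name `rs0` are hypothetical, and the helper `build_mongo_uri` is just for illustration; it is not part of any driver.

```python
from urllib.parse import urlencode

# Hypothetical replica set members and name.
HOSTS = [
    "db1.example.internal:27017",
    "db2.example.internal:27017",
    "db3.example.internal:27017",
]
REPLICA_SET = "rs0"


def build_mongo_uri(hosts, replica_set, read_preference):
    """Build a MongoDB connection string for a replica set."""
    query = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return f"mongodb://{','.join(hosts)}/?{query}"


# Ingest service: writes always go to the primary in a replica set.
ingest_uri = build_mongo_uri(HOSTS, REPLICA_SET, "primary")

# Dashboard: prefer secondaries for reads so they do not compete with writes.
dashboard_uri = build_mongo_uri(HOSTS, REPLICA_SET, "secondaryPreferred")

print(ingest_uri)
print(dashboard_uri)
```

Keep in mind that reads from secondaries can lag slightly behind the primary, which is usually acceptable for a dashboard.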

6 upvotes·1 comment·70.5K views
Don Bizzell · February 9th 2022 at 5:05PM

You might want to look at YugabyteDB. It is open source and scalable, and it can run geographically distributed clusters. Best of all, it supports PostgreSQL, so you may not have to change anything but a driver.
