You could check out MongoDB Atlas Data Lake: https://www.mongodb.com/blog/post/mongodb-atlas-data-lake-now-generally-available
It allows you to upload the JSON file into Atlas and then run queries on it using the MongoDB Query Language.
Since you mentioned that you are going to deal with a lot of JSON or XML in your app, I would suggest going with MongoDB. It is very easy to maintain JSON records in MongoDB, and you can store XML as a string there. MongoDB can traverse a large number of JSON/BSON records in a fraction of a second.
Oracle also provides a CLOB column type, which can be combined with JSON/XML constraints and works great. Hope this helps!
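For illustration, a minimal PyMongo sketch (the connection string and database/collection names are placeholders) showing a JSON record stored as a document and XML kept as a plain string field:
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
records = client["mydb"]["records"]

# JSON maps directly onto a MongoDB document
records.insert_one({"order": 1, "items": [{"sku": "A", "qty": 2}]})
# XML can simply be kept as a string field alongside other data
records.insert_one({"order": 2, "payload_xml": "<order><item sku='A' qty='2'/></order>"})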
Hi. Currently, I have a requirement where I have to create a new JSON file based on an input CSV file, validate the generated JSON file, and upload the JSON file into the application (which runs in AWS) using an API. Kindly suggest the best language to meet the above requirement. I feel Python will be better, but I am not sure how to justify why Python. Can you share your views on this?
Python is very flexible and definitely up to the job (although, in reality, any language will be able to cope with this task!). Python has some good libraries built in, and also some third-party libraries that will help here: 1. convert CSV -> JSON, 2. validate against a schema, 3. deploy to AWS.
import csv
import json

# Read the CSV; DictReader turns each row into a dict keyed by the header row.
# Note the [0]: this takes only the first data row - drop it if you need every row.
with open("your_input.csv", "r") as f:
    csv_as_dict = list(csv.DictReader(f))[0]

# Write that dict out as JSON
with open("your_output.json", "w") as f:
    json.dump(csv_as_dict, f)
The validation part is handled nicely by this library: https://pypi.org/project/jsonschema/ It allows you to create a schema and check whether what you have created works for what you want to do. It is based on the JSON Schema standard, allowing annotation and validation of any JSON document.
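For step 2, a minimal jsonschema sketch (the schema fields here are made up; replace them with whatever your output actually needs):
import json
from jsonschema import validate, ValidationError

# Hypothetical schema - describe the fields your JSON is required to have
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name"],
}

with open("your_output.json") as f:
    data = json.load(f)

try:
    validate(instance=data, schema=schema)
    print("JSON is valid")
except ValidationError as err:
    print("JSON failed validation:", err.message)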
There is also an AWS library to automate the upload - or in fact to do pretty much anything with AWS - from within your codebase: https://aws.amazon.com/sdk-for-python/ It will handle authentication to AWS and uploading/deploying the file to wherever it needs to go.
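For step 3, assuming the destination is an S3 bucket (if your application exposes its own HTTP API instead, you would POST the file to that endpoint), the upload with boto3 is a one-liner; the bucket and key names below are placeholders:
import boto3

s3 = boto3.client("s3")  # credentials are picked up from your AWS config/environment
s3.upload_file("your_output.json", "your-bucket-name", "uploads/your_output.json")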
A lot depends on the last two pieces, but the converting itself is really pretty neat.
I would use Go. Since CSV files are flat (no hierarchy), you could use the encoding/csv package to read each row, and write out the values as JSON. See https://medium.com/@ankurraina/reading-a-simple-csv-in-go-36d7a269cecd. You just have to figure out in advance what the key is for each row.
Hi all, I am trying to decide on a database for time-series data. The data could be a simple series of statistics over time, or it could be nested JSON (multi-level nesting). I have been experimenting with InfluxDB for the former case of a simple list of variables over time, and the continuous queries are powerful too. But in the latter case the complexity arises because InfluxDB requires a nested JSON to be flattened out before it is saved into the database. The nested JSON could be objects, a list of objects, or objects under objects, and completely flattening it doesn't leave the data in a state that supports the queries I have in mind.
[
  {
    "timestamp": "2021-09-06T12:51:00Z",
    "name": "Name1",
    "books": [
      { "title": "Book1", "page": 100 },
      { "title": "Book2", "page": 280 }
    ]
  },
  {
    "timestamp": "2021-09-06T12:52:00Z",
    "name": "Name2",
    "books": [
      { "title": "Book1", "page": 320 },
      { "title": "Book2", "page": 530 },
      { "title": "Book3", "page": 150 }
    ]
  }
]
Sample query: within a time range, for name xyz, find all the book titles for which # of pages < 400.
If I flatten it completely, it will result in fields like books_0_title, books_0_page, books_1_title, books_1_page, ... and by losing the nested context it will be hard to return one field (title) where a condition on another field (page) is satisfied.
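To make the problem concrete, here is a small Python sketch of that complete flattening (my own illustration, not InfluxDB code), using the fields from the sample above:
def flatten(obj, prefix=""):
    # Recursively collapse dicts and lists into a single level of
    # underscore-joined field names, e.g. books_0_title
    fields = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            fields.update(flatten(value, f"{prefix}{key}_"))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            fields.update(flatten(value, f"{prefix}{i}_"))
    else:
        fields[prefix.rstrip("_")] = obj
    return fields

record = {
    "name": "Name1",
    "books": [{"title": "Book1", "page": 100}, {"title": "Book2", "page": 280}],
}
print(flatten(record))
# {'name': 'Name1', 'books_0_title': 'Book1', 'books_0_page': 100,
#  'books_1_title': 'Book2', 'books_1_page': 280}
# Once flattened like this, "titles where page < 400" means matching up
# books_N_title with books_N_page by index, which the database can't do for you.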
Appreciate any suggestions. Even a piece of generic advice on handling time-series data and choosing a database is welcome!
Can I create reusable ARM templates (JSON files) in the Bit community? I see examples of components made from React code. How can I do the same using JSON files?
Searching for a tool (library?) to build a big document browser/search, something like a bible browser with selectable chapters and paragraphs. Navigation should be tree-based, and searching within the files (the content is split into several JSON files) would be a nice addition. Searching can be server-side (i.e., PHP) with a JavaScript frontend for AJAX loading. Can someone point me in the right direction?
I recommend checking out Algolia.
They have a very affordable entry-level plan and even a small, free level plan for new websites.
Their JavaScript API is pretty simple to implement as well.
I’d be happy to help set this up for you if you would like some help. I am booked through middle of February but I open up later next month.
Have you explored Elasticsearch so far? You can build a simple PHP interface or use any CMS (e.g., WordPress) for your front-end.
We are building a cloud-based analytical app, and most of the data for the UI is supplied from SQL Server to Delta Lake, and then from Delta Lake to Azure Cosmos DB as JSON using Databricks, so that the API can send it to the front-end. Sometimes we get larger documents while transforming table rows into JSON, and they exceed Cosmos DB's 2 MB document size limit. What is the best solution for replacing Cosmos DB?
Thanks for the input, Ivan Reche. If we store big documents in a blob container, then how will the Python APIs query those and send them to the UI? And if any updates happen in the UI, then the API has to write those changes back to the big documents as a copy.
Do you know what the max size of one of your documents might be? Mongo (which you can also use on Azure) allows for larger documents (16 MB). With that said, I ran into this issue when I was first using Cosmos, and I wound up rethinking the way I was storing documents. I don't know if this is an option for your scenario, but what I ended up doing was breaking my documents up into smaller subdocuments. A thought process that I have come to follow is that if any property is an array (or can at least be an array of length N), make that array simply a list of IDs that point to other documents.
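As a Python sketch of that pattern (the document shapes and names here are invented, not from the question):
# One oversized parent document becomes a small parent plus child documents
# that it references by id, so no single document approaches the size cap.
order = {
    "id": "order-123",
    "customer": "acme",
    "lineItemIds": ["item-1", "item-2", "item-3"],  # ids instead of embedded items
}
line_items = [
    {"id": "item-1", "orderId": "order-123", "sku": "A", "qty": 10},
    {"id": "item-2", "orderId": "order-123", "sku": "B", "qty": 4},
    {"id": "item-3", "orderId": "order-123", "sku": "C", "qty": 7},
]
# Each piece can now be written, read, and updated independently.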
Aerospike might be one to check out. It can store 8 MB objects and provides much better performance and cost-effectiveness compared with Cosmos and Mongo.
Hello all, I'm building an app that will enable users to create documents using the CKEditor or TinyMCE editor. The data is then stored in a database and retrieved for display to the user; these docs can also contain image data. The number of pages generated for a single document can go up to 1000, so by design each page is stored as a separate JSON document. I'm wondering which database is the right one to choose between ArangoDB and PostgreSQL. Your thoughts and advice, please. Thanks, Kashyap
It depends on the rest of your application/infrastructure. First, would you use the features provided by graph storage?
If not, in terms of performance PostgreSQL is very good (even better than most NoSQL databases) for storing static JSON. If your JSON documents have to be updated frequently, MongoDB could be an option as well.
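For a sense of what the PostgreSQL route looks like, a minimal psycopg2 sketch with a jsonb column (the connection string, table, and document shape are placeholders):
import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect("dbname=docs_db")  # placeholder connection string
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS pages (id serial PRIMARY KEY, doc jsonb)")

# Store one editor page as a JSON document
page = {"title": "Chapter 1", "blocks": [{"type": "paragraph", "text": "..."}]}
cur.execute("INSERT INTO pages (doc) VALUES (%s)", (Json(page),))

# Query inside the JSON with the ->> operator (returns text)
cur.execute("SELECT id FROM pages WHERE doc->>'title' = %s", ("Chapter 1",))
print(cur.fetchall())
conn.commit()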
Hello Jean, the application's main utility is to create and update documents, hence the choice of a database that supports JSON. I wouldn't go the MongoDB route due to a past bad experience and its licensing restrictions compared to an open-source DB.
I have a project (in production) in which one part generates HTML from a JSON object. Normally we use Microsoft SQL Server as our main database, but when it comes to this part, some team members suggest working with a NoSQL database since we are going to handle JSON data for both retrieval and querying. Others replied that this will add complexity and that we will lose SQL Server's unit of work, which will break the atomic behavior, and they suggest continuing with SQL Server since it supports working with JSON. If you have practical experience using JSON with SQL Server, kindly share your feedback.
I agree with the advice you have been given to stick with SQL Server. If you are on a recent SQL Server version, you can query inside the JSON field. You should set up a test database with a JSON field and try some queries. Once you understand it and can demonstrate it, show it to the other developers who are suggesting MongoDB. Once they see it working with their own eyes, they may drop their position of Mongo over SQL. I would only seriously consider MongoDB if there were no other SQL requirements. I wouldn't do both; I'd be all SQL or all Mongo.
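If it helps the demo, here is a rough sketch of querying inside a JSON column from Python with pyodbc; the connection string, table, and column names are placeholders, but JSON_VALUE itself is standard SQL Server (2016+) syntax:
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"
)
cursor = conn.cursor()
# Filter rows by a value stored inside the JSON column
cursor.execute(
    "SELECT Id FROM Documents WHERE JSON_VALUE(Payload, '$.status') = ?",
    ("published",),
)
for row in cursor.fetchall():
    print(row.Id)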
I think the key thing to look at is what kind of queries you're expecting to run on that JSON and how stable that data is going to be. (And whether you actually need to store the data as JSON; it's generally pretty inexpensive to generate a JSON object.)
MongoDB gets rid of the relational aspect of data in favor of data being very fluid in structure.
So if your JSON is going to vary a lot/is unpredictable/will change over time and you need to run queries efficiently like 'records where the field x exists and its value is higher than 3', that's a great use case for MongoDB.
It's hard to solve this in a standard relational model: indexing on a single column that has wildly different values is pretty much impossible to do efficiently, and pulling the data out into its own columns is hard because it's hard to predict how many columns you'd have or what their datatypes would be. If this sounds like your predicament, 100% go for MongoDB.
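For reference, that kind of query is a one-liner in MongoDB; a small PyMongo sketch (the connection string and database/collection names are placeholders):
from pymongo import MongoClient

records = MongoClient("mongodb://localhost:27017")["mydb"]["records"]  # placeholders

# "records where the field x exists and its value is higher than 3"
for doc in records.find({"x": {"$exists": True, "$gt": 3}}):
    print(doc)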
If this is always going to be more or less the same JSON and the fields are going to be predictably the same, then the fact that it's JSON doesn't particularly matter much. Your indexes are going to treat it much like a long string.
If the queried fields are very predictable, you should probably consider storing them as separate columns for better querying capabilities. I.e., if you have {"x":1, "y":2}, {"x":5, "y":6}, {"x":9, "y":0} - just make a table with x and y columns and generate the JSON. The CPU hit is worth it compared to the querying capabilities you gain.
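A tiny Python sketch of that last point, using the rows from the example above; the JSON is produced only when it's needed, while x and y stay in real, indexable columns:
import json

rows = [(1, 2), (5, 6), (9, 0)]  # values as they would come back from the x/y columns
payload = [json.dumps({"x": x, "y": y}) for x, y in rows]
print(payload)  # ['{"x": 1, "y": 2}', '{"x": 5, "y": 6}', '{"x": 9, "y": 0}']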
Hi Blacknight. If your single motivation is to store JSON, don't bother and continue with SQL Server.
When it comes to MongoDB, the true power is getting out of the standard relational DB thinking (a MongoDB collection is very different to a SQL Server table).
It takes a while to make the shift, but once you have, and you realise the power and freedom you get (to store the data in the most ad hoc form for your needs), you'll never go back to SQL Server and the relational model.