Jupyter

Needs advice on Druid, Kafka, and Apache Spark

My process works like this: once a month I receive data, either from Google BigQuery or as Parquet files from Azure Blob Storage. A script cleans the data and stores the result as partitioned Parquet files, because the next step cannot load all of the data into memory.

The next step runs a heavy computation in parallel (one task per partition) and stores three intermediate outputs as Parquet files: two are used for statistics, and the third is filtered to create the final files.

I build a report from the two statistics files in a Jupyter notebook and convert it to HTML.

  • Everything is done with vanilla Python and pandas.
  • Sometimes the data arrives in a different format.
  • The cloud service is Microsoft Azure.
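The per-partition step described above might look roughly like the sketch below. It is only an illustration under assumptions: the function names, the returned row count, and the directory layout are hypothetical and not taken from the actual script, and the real computation is left as a placeholder.

```python
from concurrent.futures import Executor, ProcessPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, Type


def list_partitions(root: Path) -> list[Path]:
    """Return every parquet file under root, sorted for reproducible runs."""
    return sorted(root.rglob("*.parquet"))


def process_partition(path: Path) -> int:
    """Hypothetical worker: load one partition and return its row count."""
    import pandas as pd  # imported lazily so only the workers need pandas

    df = pd.read_parquet(path)  # requires pyarrow or fastparquet
    # ... the heavy per-partition computation would go here ...
    return len(df)


def run_per_partition(
    paths: Iterable[Path],
    worker: Callable[[Path], int],
    max_workers: int = 4,
    executor_cls: Type[Executor] = ProcessPoolExecutor,
) -> dict[str, int]:
    """Apply worker to each partition in parallel.

    Each worker holds a single partition in memory at a time, so the
    full dataset never has to fit in RAM at once.
    """
    paths = list(paths)
    with executor_cls(max_workers=max_workers) as pool:
        results = pool.map(worker, paths)
        # Consume the results while the pool is still open.
        return {str(p): r for p, r in zip(paths, results)}
```

Calling `run_per_partition(list_partitions(Path("cleaned")), process_partition)` would process a hypothetical `cleaned/` directory. Peak memory is then bounded by roughly `max_workers` partitions, which is the same property a move to Spark would need to preserve.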

What I'm considering is the following:

Ingest the data with Kafka or with native Python, do the first processing step, and store the result in Druid. The second processing step would then be done with Apache Spark, reading its data from Apache Druid.

The intermediate outputs could be stored in Druid too, and visualization would be done with Apache Superset.

5 upvotes·177.9K views
Needs advice on Atom, Jupyter, and PyCharm

I am learning Python and working through lots of hands-on Python problems. I like the feel of Jupyter Notebook, but I am concerned that it will slow down my computer. Which is better for Python coding: PyCharm, Jupyter, or Atom-IDE?

2 upvotes·304.2K views
Replies (1)
Managing Director at DEEPSITE LIMITED
Recommends PyCharm

It is a full-featured IDE; refactoring and debugging are very powerful in PyCharm compared to the other two.

3 upvotes·1 comment·286 views
Mateusz Kania · September 28th 2021 at 3:32PM

I upvoted Xiaoping Hu's answer, although for me PyCharm is a little overwhelming. I will probably go back to it once I have a firm Python background. For now I am using Neovim plus a Python terminal. :) I believe it is harder but more rewarding in the long term (learning Vim together with Python). And it certainly does not slow down the machine. :D

I just want to add that I am fairly comfortable with R (I often used RStudio, which I see as the R equivalent of PyCharm), so I know my way around a high-level language.
