I have data stored in an Amazon S3 bucket in Parquet format.
I want to copy this data from S3 to Amazon Redshift, and I currently use COPY commands to do so. However, I have to run them manually. I would like to automate this so that whenever a new file arrives in S3, it is copied into the required table in Redshift. Can you suggest what different approaches I can use?
Hello Aditya, I haven't tried this myself, but in theory you can use Amazon S3 event notifications to emit an event whenever one of your Parquet files is created or removed in S3 (for this use case you would only need the object-created events). First you'd have to set up the notification: https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html
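As a rough, untested sketch of that first step, here is how the notification configuration could be built in Python. I'm assuming the event target is a Lambda function (S3 can also notify SNS or SQS), and the bucket name and function ARN are made-up placeholders. The `boto3` call is kept inside its own function so the configuration can be inspected without AWS access.

```python
# Hypothetical placeholders, not values from your account.
BUCKET = "my-parquet-bucket"
LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:load-to-redshift"


def build_notification_config(lambda_arn, suffix=".parquet"):
    """Build an S3 notification config that fires on new objects with the given suffix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                # Only react to newly created objects, not deletes.
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def apply_notification_config(bucket=BUCKET, lambda_arn=LAMBDA_ARN):
    """Attach the configuration to the bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn),
    )
```

Note that S3 also needs permission to invoke the Lambda function, which this sketch does not set up.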
Then you would want to subscribe to the event and use the event info to pull the file into Redshift: https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-event-notifications.html#working-with-event-notifications-subscribe
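Again untested, but one way to "use the event info" could look like the Lambda handler below. It reads the bucket and key from each S3 event record, builds a COPY statement, and submits it through the Redshift Data API. The table name, IAM role, cluster identifier, database, and user are all placeholders I made up, and using the Data API is my assumption rather than the only option.

```python
import urllib.parse

# Hypothetical placeholders; replace with your own values.
TARGET_TABLE = "my_schema.my_table"
IAM_ROLE_ARN = "arn:aws:iam::123456789012:role/MyRedshiftCopyRole"
CLUSTER_ID = "my-cluster"
DATABASE = "dev"
DB_USER = "awsuser"


def build_copy_statements(event, table=TARGET_TABLE, iam_role=IAM_ROLE_ARN):
    """Return one COPY statement per newly created .parquet object in the event."""
    statements = []
    for record in event.get("Records", []):
        # Skip anything that is not an object-created event (e.g. deletes).
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event payloads.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if not key.endswith(".parquet"):
            continue
        statements.append(
            f"COPY {table} FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{iam_role}' FORMAT AS PARQUET;"
        )
    return statements


def lambda_handler(event, context):
    """Lambda entry point: submit a COPY for each new Parquet file."""
    import boto3  # imported here so build_copy_statements works without boto3

    client = boto3.client("redshift-data")
    statements = build_copy_statements(event)
    for sql in statements:
        client.execute_statement(
            ClusterIdentifier=CLUSTER_ID,
            Database=DATABASE,
            DbUser=DB_USER,
            Sql=sql,
        )
    return {"submitted": len(statements)}
```

The Data API call is asynchronous, so this only submits the COPY; you would still need to check the statement status if you want to know whether the load succeeded.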
-Scott