blog_posts: 44f5812787
id | createdDate | title | link | postExcerpt | featuredImageUrl | hash | contributors | modifiedDate | displayDate |
---|---|---|---|---|---|---|---|---|---|
blog-posts#33-10049 | 2020-05-14 18:58:09 | Optimize memory management in AWS Glue | https://aws.amazon.com/blogs/big-data/optimize-memory-management-in-aws-glue/ | In this post, we discuss a number of techniques to enable efficient memory management for Apache Spark applications when reading data from Amazon S3 and compatible databases using a JDBC connector. We describe how Glue ETL jobs can utilize the partitioning information available from the AWS Glue Data Catalog to prune large datasets, manage large numbers of small files, and use JDBC optimizations for partitioned reads and batch record fetch from databases. You can use some or all of these techniques to help ensure your ETL jobs perform well. | https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2020/05/14/Glue_Grouping1-300x148.png | 44f5812787 | Mohit Saxena | 2022-12-05 17:55:42 | 14 May 2020 |
Links from other tables
- 9 rows from blog_post_hash in blog_post_tags