1802sub_processor.py
1802test.ipynb
README.md
processor.py
sub_processor1802.py


notebook

About

This project is a simplified demonstration of a real-world big data project. It shows how to handle large-scale datasets efficiently with PySpark and Apache Iceberg. The repository includes five sample datasets that replicate real-world scenarios, with a focus on reducing memory use and write time when writing Iceberg tables.
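
As a rough illustration of what writing a table to Iceberg from PySpark can look like, here is a minimal sketch. It is not taken from this repository's code: the helper names (`iceberg_conf`, `build_session`, `write_table`), the catalog name `local`, and the warehouse path are all assumptions chosen for the example. It also assumes the Iceberg Spark runtime jar matching your Spark version is already on the classpath.

```python
def iceberg_conf(catalog="local", warehouse="/tmp/iceberg-warehouse"):
    """Return Spark settings for a Hadoop-type Iceberg catalog.

    The catalog name and warehouse path are placeholders (assumptions).
    """
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Enables Iceberg's SQL extensions in Spark.
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the catalog under the given name.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hadoop",
        f"{prefix}.warehouse": warehouse,
    }


def build_session(conf, app_name="iceberg-demo"):
    """Create a SparkSession with the given settings (hypothetical helper)."""
    # Imported lazily so the config helper above works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def write_table(df, table):
    """Create or replace an Iceberg table from a DataFrame.

    `table` is a fully qualified name such as "local.db.events" (assumed).
    """
    df.writeTo(table).using("iceberg").createOrReplace()
```

A typical call sequence under these assumptions would be `spark = build_session(iceberg_conf())`, followed by `write_table(spark.read.parquet(path), "local.db.events")`, where `path` and the table name are placeholders.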
