Blocks
Blocks provides a simple interface to read, organize, and manipulate structured data in files on local and cloud storage.
Install
pip install sq-blocks
Features
import blocks
# Load one or more files with the same interface
df = blocks.assemble('data.csv')
train = blocks.assemble('data/*[0-7].csv')
test = blocks.assemble('data/*[89].csv')
# With direct support for files on GCS
df = blocks.assemble('gs://mybucket/data.csv')
df = blocks.assemble('gs://mybucket/data/*.csv')
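The character-class patterns in the train/test lines above split files by the last digit of their name. The sketch below illustrates that split with Python's standard `fnmatch` module rather than with blocks itself, so it shows only the generic shell-style pattern semantics. The file names (`data/part_0.csv` through `data/part_9.csv`) are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical file names; only the final digit before ".csv" matters here.
files = [f"data/part_{i}.csv" for i in range(10)]

# Same character-class patterns as in the train/test example.
train = [f for f in files if fnmatchcase(f, "data/*[0-7].csv")]
test = [f for f in files if fnmatchcase(f, "data/*[89].csv")]

# The two patterns partition the files: each file lands in exactly one set.
assert sorted(train + test) == files
assert not set(train) & set(test)

print("train:", train)
print("test:", test)
```

Because `[0-7]` and `[89]` cover disjoint digits, no file can end up in both sets, provided every file name ends in a digit before the extension.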
The interface emulates the tools you’re used to from the command line, with full support for globbing and pattern matching. Blocks can also handle more complicated directory layouts as your data grows in complexity:
Layout | Recipe |
---|---|
[layout diagram] | blocks.assemble('data/**') |
[layout diagram] | blocks.assemble('data/g1/*') |
[layout diagram] | blocks.assemble('data/*/part_01.pq') |
[layout diagram] | blocks.assemble('data/g[124]/part_01.pq') |
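To see which files each recipe pattern above would select, the following sketch expands the same four patterns with Python's standard `glob` module against a throwaway directory tree. It does not call blocks, and it says nothing about how blocks combines the matched files. The layout (groups `g1` to `g4`, each holding `part_00.pq` and `part_01.pq`) and the helper `matched_files` are assumptions made up for illustration.

```python
import glob
import os
import tempfile


def matched_files(root, pattern):
    """Return files under root matching a shell-style pattern, as sorted relative paths."""
    hits = glob.glob(os.path.join(glob.escape(root), pattern), recursive=True)
    return sorted(
        os.path.relpath(p, root).replace(os.sep, "/")
        for p in hits
        if os.path.isfile(p)  # "**" also matches directories; keep files only
    )


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: four groups, two (empty) part files each.
    for group in ("g1", "g2", "g3", "g4"):
        os.makedirs(os.path.join(root, "data", group))
        for part in ("part_00.pq", "part_01.pq"):
            open(os.path.join(root, "data", group, part), "w").close()

    everything = matched_files(root, "data/**")             # every file in every group
    one_group = matched_files(root, "data/g1/*")            # all parts of group g1
    one_part = matched_files(root, "data/*/part_01.pq")     # part_01 from every group
    some_groups = matched_files(root, "data/g[124]/part_01.pq")  # part_01 from g1, g2, g4

for name, files in [
    ("data/**", everything),
    ("data/g1/*", one_group),
    ("data/*/part_01.pq", one_part),
    ("data/g[124]/part_01.pq", some_groups),
]:
    print(name, "->", files)
```

Running a pattern through a helper like this before passing it to `blocks.assemble` is one way to check that it selects the files you expect.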