
Introduction

From database exports to API responses to spreadsheet downloads, CSV files are everywhere in data workflows. Pandas handles them well, but sometimes you need a quick solution that works without installing Pandas at all.

With an understanding of list and generator comprehensions, the built-in csv module can handle most common CSV tasks in a single line of code. These one-liners are perfect for quick data exploration, ETL debugging, or working in a constrained environment where no external libraries are available.
Let’s use a sample business dataset with 50 records (data.csv) and get started!
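If you don't have the dataset handy, here is a minimal sketch that creates a stand-in data.csv with hypothetical rows. The column layout is inferred from the one-liners below (column 0 is transaction_id, 1 the company, 2 the category, 3 the amount, 5 the salesperson, 6 the region, 7 the tier); the exact fields of the real dataset may differ.

```python
import csv

# Hypothetical rows matching the column layout the examples assume:
# 0=transaction_id, 1=company, 2=category, 3=amount, 4=date,
# 5=salesperson, 6=region, 7=tier
rows = [
    ["transaction_id", "company", "category", "amount", "date",
     "salesperson", "region", "tier"],
    ["T001", "Acme Corp", "Software", "45000.00", "2024-01-15",
     "Mike Rodriguez", "North America", "Enterprise"],
    ["T002", "Beta LLC", "Hardware", "12500.50", "2024-01-16",
     "Sarah Chen", "Europe", "SMB"],
]

path = "data.csv"
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)
```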
🔗 Link to the code on GitHub
1. Calculate a Column Total

Calculate the total of any numeric column across all rows.

print(f"Total: ${sum(float(r[3]) for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id'):,.2f}")

Here, the path variable points to the sample CSV file. For this example in Google Colab, it is path = "/content/data.csv".
Output:
Here, __import__('csv') imports the built-in csv module inline. The generator expression skips the header row, converts the column values to floats, sums them, and formats the result with a currency symbol. Adjust the column index (3) and the header check as needed.
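Unrolled into multiple lines, the same logic looks like this. A small in-memory CSV (hypothetical values) stands in for the file:

```python
import csv, io

# A tiny stand-in for data.csv; columns assumed: id, company, category, amount
data = """transaction_id,company,category,amount
T001,Acme Corp,Software,45000.00
T002,Beta LLC,Hardware,12500.50
"""

total = 0.0
reader = csv.reader(io.StringIO(data))
next(reader)                    # skip the header row instead of comparing r[0]
for r in reader:
    total += float(r[3])        # column 3 holds the amount

print(f"Total: ${total:,.2f}")  # → Total: $57,500.50
```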
2. Find the Group with the Maximum Total

Find out which group accounts for the largest overall total in your dataset.

print(max({r[5]: sum(float(row[3]) for row in __import__('csv').reader(open(path)) if row[5] == r[5] and row[0] != 'transaction_id') for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id'}.items(), key=lambda x: x[1]))

Output:

('Mike Rodriguez', 502252.0)

The dictionary comprehension groups rows by column 5 and sums the column 3 values for each group. One pass collects the group keys and another computes each group's total. max() with a lambda then picks the group with the largest total. Adjust the column indices for different group-by operations.
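Note that this one-liner re-reads the file once per group, which gets slow on large files. A single-pass sketch with a plain dict does the same job; the three-row dataset here is hypothetical, with the salesperson in column 1 and the amount in column 2:

```python
import csv, io

data = """transaction_id,salesperson,amount
T001,Mike Rodriguez,300000.0
T002,Sarah Chen,150000.0
T003,Mike Rodriguez,202252.0
"""

# Accumulate totals per salesperson in a single pass over the rows
totals = {}
reader = csv.reader(io.StringIO(data))
next(reader)                                   # skip the header
for r in reader:
    totals[r[1]] = totals.get(r[1], 0.0) + float(r[2])

top = max(totals.items(), key=lambda kv: kv[1])
print(top)                                     # → ('Mike Rodriguez', 502252.0)
```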
3. Filter and Display Matching Rows

Show only the rows that match a specific condition, with formatted output.

print("\n".join(f"{r[1]}: ${float(r[3]):,.2f}" for r in __import__('csv').reader(open(path)) if r[7] == 'Enterprise' and r[0] != 'transaction_id'))

Output:
Acme Corp: $45,000.00
Gamma Solutions: $78,900.00
Zeta Systems: $156,000.00
Iota Industries: $67,500.25
Kappa LLC: $91,200.75
Nu Technologies: $76,800.25
Omicron LLC: $128,900.00
Sigma Corp: $89,700.75
Phi Corp: $176,500.25
Omega Technologies: $134,600.50
Alpha Solutions: $71,200.25
Matrix Systems: $105,600.25

The generator expression keeps the rows where column 7 equals Enterprise, then formats columns 1 and 3. Using "\n".join(...) avoids printing a list of None values.
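If you prefer column names over numeric indices, csv.DictReader does the same filtering more readably. A sketch with a hypothetical two-row dataset using the assumed column layout:

```python
import csv, io

data = """transaction_id,company,category,amount,date,salesperson,region,tier
T001,Acme Corp,Software,45000.00,2024-01-15,Mike Rodriguez,North America,Enterprise
T002,Beta LLC,Hardware,12500.50,2024-01-16,Sarah Chen,Europe,SMB
"""

# csv.DictReader maps each row to a dict keyed by the header names
out = [f'{row["company"]}: ${float(row["amount"]):,.2f}'
       for row in csv.DictReader(io.StringIO(data))
       if row["tier"] == "Enterprise"]
print("\n".join(out))   # → Acme Corp: $45,000.00
```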
4. Group-by Distribution

Get the total for every unique value in the grouping column.

print({g: f"${sum(float(row[3]) for row in __import__('csv').reader(open(path)) if row[6] == g and row[0] != 'transaction_id'):,.2f}" for g in set(row[6] for row in __import__('csv').reader(open(path)) if row[0] != 'transaction_id')})

Output:

{'Asia Pacific': '$326,551.75', 'Europe': '$502,252.00', 'North America': '$985,556.00'}

The dictionary comprehension first extracts the unique values from column 6 using a set comprehension, then calculates the sum of column 3 for each group. The generator expressions keep this memory-efficient. Change the column indices to group by different fields.
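As with the maximum-group example, this re-reads the file once per group. A collections.defaultdict makes the same distribution in one pass; the mini-dataset below is hypothetical, with the region in column 6 and the amount in column 3:

```python
import csv, io
from collections import defaultdict

data = """transaction_id,company,category,amount,date,salesperson,region
T001,Acme,Software,100.0,2024-01-15,Mike,Europe
T002,Beta,Hardware,200.5,2024-01-16,Sara,Asia Pacific
T003,Gamma,Software,50.0,2024-01-17,Mike,Europe
"""

totals = defaultdict(float)           # missing keys start at 0.0
reader = csv.reader(io.StringIO(data))
next(reader)                          # skip the header row
for r in reader:
    totals[r[6]] += float(r[3])       # accumulate per region in one pass

print({g: f"${v:,.2f}" for g, v in totals.items()})
```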
5. Threshold Filter with Sorting

Find and rank all the records above a particular numeric threshold.

print([(n, f"${v:,.2f}") for n, v in sorted([(r[1], float(r[3])) for r in list(__import__('csv').reader(open(path)))[1:] if float(r[3]) > 100000], key=lambda x: x[1], reverse=True)])

Output:

[('Phi Corp', '$176,500.25'), ('Zeta Systems', '$156,000.00'), ('Omega Technologies', '$134,600.50'), ('Omicron LLC', '$128,900.00'), ('Matrix Systems', '$105,600.25')]

This filters the rows where column 3 exceeds 100000, builds (name, value) tuples, sorts by the numeric value in descending order, and then formats the values as currency for display. Adjust the threshold and columns as needed.
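The same filter-then-sort pattern, spelled out step by step with operator.itemgetter as the sort key. The three-row dataset is hypothetical, with the company in column 1 and the amount in column 3:

```python
import csv, io
from operator import itemgetter

data = """transaction_id,company,category,amount
T001,Acme,Software,120000.0
T002,Beta,Hardware,90000.0
T003,Gamma,Software,150000.0
"""

rows = list(csv.reader(io.StringIO(data)))[1:]   # drop the header row
big = [(r[1], float(r[3])) for r in rows if float(r[3]) > 100000]
big.sort(key=itemgetter(1), reverse=True)        # largest amount first

print([(name, f"${v:,.2f}") for name, v in big])
# → [('Gamma', '$150,000.00'), ('Acme', '$120,000.00')]
```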
6. Count Unique Values

Quickly determine how many distinct values a column contains.

print(len(set(r[2] for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id')))

Output:

Here, the set comprehension extracts the unique values from column 2 and len() counts them. This is useful for checking data diversity or finding the number of distinct categories.
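If you also want the frequency of each value, not just the count of distinct ones, collections.Counter gets both in one pass. A sketch over a hypothetical mini-dataset with the category in column 2:

```python
import csv, io
from collections import Counter

data = """transaction_id,company,category
T001,Acme,Software
T002,Beta,Hardware
T003,Gamma,Software
"""

reader = csv.reader(io.StringIO(data))
next(reader)                          # skip the header row
counts = Counter(r[2] for r in reader)

print(len(counts))                    # number of distinct categories → 2
print(counts.most_common())           # per-category frequencies, largest first
```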
7. Conditional Aggregation

Calculate the average or other statistics for specific subsets of your data.

print(f"Average: ${sum(float(r[3]) for r in __import__('csv').reader(open(path)) if r[6] == 'North America' and r[0] != 'transaction_id') / sum(1 for r in __import__('csv').reader(open(path)) if r[6] == 'North America' and r[0] != 'transaction_id'):,.2f}")

Output:

This one-liner calculates the average of column 3 for rows matching a condition on column 6. It divides a sum by a count, each computed with a generator expression. It reads the file twice but keeps memory usage low.
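A single-pass version accumulates the sum and the count together, avoiding the second read. The three-row dataset is hypothetical, with the region in column 6 and the amount in column 3:

```python
import csv, io

data = """transaction_id,company,category,amount,date,salesperson,region
T001,Acme,Software,100.0,2024-01-15,Mike,North America
T002,Beta,Hardware,200.0,2024-01-16,Sara,Europe
T003,Gamma,Software,300.0,2024-01-17,Mike,North America
"""

total, count = 0.0, 0
reader = csv.reader(io.StringIO(data))
next(reader)                               # skip the header row
for r in reader:
    if r[6] == "North America":            # condition on the region column
        total += float(r[3])
        count += 1

print(f"Average: ${total / count:,.2f}")   # → Average: $200.00
```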
8. Multi-Column Filter

Apply multiple filter conditions across different columns simultaneously.

print("\n".join(f"{r[1]} | {r[2]} | ${float(r[3]):,.2f}" for r in __import__('csv').reader(open(path)) if r[2] == 'Software' and float(r[3]) > 50000 and r[0] != 'transaction_id'))

Output:
Zeta Systems | Software | $156,000.00
Iota Industries | Software | $67,500.25
Omicron LLC | Software | $128,900.00
Sigma Corp | Software | $89,700.75
Phi Corp | Software | $176,500.25
Omega Technologies | Software | $134,600.50
Nexus Corp | Software | $92,300.75
Apex Industries | Software | $57,800.00

This chains multiple filter conditions with and operators, combining string equality and numeric comparisons, and formats the output with pipe separators for a clean display.
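Spelled out as a loop, the chained conditions read like this. The mini-dataset is hypothetical, with the category in column 2 and the amount in column 3:

```python
import csv, io

data = """transaction_id,company,category,amount
T001,Acme,Software,156000.0
T002,Beta,Software,30000.0
T003,Gamma,Hardware,90000.0
"""

matches = []
reader = csv.reader(io.StringIO(data))
next(reader)                               # skip the header row
for r in reader:
    # combine a string equality check with a numeric comparison
    if r[2] == "Software" and float(r[3]) > 50000:
        matches.append(f"{r[1]} | {r[2]} | ${float(r[3]):,.2f}")

print("\n".join(matches))                  # → Acme | Software | $156,000.00
```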
9. Compute Column Statistics

Get the minimum, maximum, and average of a numeric column in one shot.

vals = [float(r[3]) for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id']; print(f"Min: ${min(vals):,.2f} | Max: ${max(vals):,.2f} | Avg: ${sum(vals)/len(vals):,.2f}"); print(vals)

Output:

Min: $8,750.25 | Max: $176,500.25 | Avg: $62,564.13
[45000.0, 12500.5, 78900.0, 23400.75, 8750.25, 156000.0, 34500.5, 19800.0, 67500.25, 91200.75, 28750.0, 43200.5, 76800.25, 15600.75, 128900.0, 52300.5, 31200.25, 89700.75, 64800.0, 22450.5, 176500.25, 38900.75, 27300.0, 134600.5, 71200.25, 92300.75, 18900.5, 105600.25, 57800.0]

This builds a list of the column 3 values, then calculates the minimum, maximum, and average in one line; semicolons separate the statements. It uses more memory than streaming, but for multiple statistics it is faster than reading the file more than once.
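The standard-library statistics module offers the same aggregates (and more, like median and standard deviation) with clearer names. A sketch over a hypothetical two-column dataset with the amount in column 1:

```python
import csv, io, statistics

data = """transaction_id,amount
T001,100.0
T002,300.0
T003,200.0
"""

reader = csv.reader(io.StringIO(data))
next(reader)                                # skip the header row
vals = [float(r[1]) for r in reader]

# statistics.mean replaces the manual sum(vals)/len(vals)
print(f"Min: ${min(vals):,.2f} | Max: ${max(vals):,.2f} "
      f"| Avg: ${statistics.mean(vals):,.2f}")
# → Min: $100.00 | Max: $300.00 | Avg: $200.00
```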
10. Export Filtered Data

Create a new CSV file containing only the rows that meet your criteria.

__import__('csv').writer(open('filtered.csv','w',newline="")).writerows(r for r in list(__import__('csv').reader(open(path)))[1:] if float(r[3]) > 75000)

This reads the CSV, filters the rows on a condition, and writes them to a new file. The newline="" parameter prevents extra blank lines. Note that this example drops the header (it uses [1:]), so add it explicitly if you need a header in the output.
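A variant that keeps the header looks like this; the two-row dataset is hypothetical, with the amount in column 3:

```python
import csv, io

data = """transaction_id,company,category,amount
T001,Acme,Software,80000.0
T002,Beta,Hardware,50000.0
"""

rows = list(csv.reader(io.StringIO(data)))
header, body = rows[0], rows[1:]            # separate header from data rows

with open("filtered.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(header)                      # keep the header this time
    w.writerows(r for r in body if float(r[3]) > 75000)
```

Using a with block also guarantees the output file is flushed and closed, which the one-liner's bare open() leaves to the interpreter.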
Wrapping Up

I hope you find these one-liners helpful for your CSV processing.

One-liners like these work well for:
- Quick data exploration and validation
- Simple data transformations
- Prototyping before writing a full script

But you should avoid them for:
- Production data processing
- Files that need complex error handling
- Multi-step transformations

These techniques use Python's built-in csv module for when you need a quick solution without setup overhead. Happy analyzing!
Bala Priya C is a developer and technical writer from India. She likes to work at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. She also creates engaging resource overviews and coding tutorials.