Python for Data Science (Free 7-Day Mini-Course)

by SkillAiNest

Photo by Editor | ChatGPT

. Introduction

Welcome to Python for Data Science, a free 7-day mini course for beginners! If you are just getting started with data science or want to build up the basic skills, this is the beginner-friendly course for you. Over the next seven days, you will learn how to work on data tasks using core Python only.

Here is what you will learn:

  • Work with basic Python data structures
  • Clean and process messy text data
  • Summarize and group data with dictionaries (much like you would in SQL or Excel)
  • Write reusable functions that keep your code clean and efficient
  • Handle errors gracefully so your scripts don’t crash on dirty input data
  • And finally, build a simple data profiling tool to inspect any CSV dataset

Let’s start!

🔗 Link to the code on GitHub

. Day 1: Variables, Data Types, and File I/O

In data science, everything starts with raw data: survey responses, logs, spreadsheets, forms, scraped websites, etc. Before you can analyze or model anything, you need to:

  • Load the data
  • Understand its shape and types
  • Start cleaning and inspecting it

Today, you will learn:

  • Basic Python data types
  • How to read and write raw .txt files

!! 1. Variables

In Python, a variable is a named reference to a value. In data terms, you can think of variables as fields, columns, or metadata.

filename = "responses.txt"
survey_name = "Q3 Customer Feedback"
max_entries = 100

!! 2. Data types you will often use

Don’t worry about the more obscure types yet. You will mostly use the following:

Type     What it’s used for            Example
str      Raw text, column names        "Age", "unknown"
int      Counts, discrete variables    42, 0, -3
float    Continuous variables          3.14, 0.0, -100.5
bool     Flags / binary outcomes       True, False
None     Missing/null values           None

Knowing when you are dealing with None values – and how to check for or replace them – is step zero of data cleaning.
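
Here is a minimal sketch (with made-up values) of checking for missing values and replacing them with a default:

# A minimal sketch: replace missing (None) values with a default
raw_ages = [34, None, 29, None, 41]

cleaned_ages = []
for age in raw_ages:
    if age is None:              # check for missing values with `is None`
        cleaned_ages.append(0)   # or skip the row, or use another sentinel
    else:
        cleaned_ages.append(age)

print(cleaned_ages)  # [34, 0, 29, 0, 41]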

!! 3. File input: Reading raw data

Most real-world data lives in .txt, .csv, or .log files. You often need to load files line by line, not all at once (especially if they are large).

Say you have a file responses.txt with the following contents:

Yes
No
Yes
Maybe
No

Here’s how you read it:

with open("responses.txt", "r") as file:
    lines = file.readlines()

for i, line in enumerate(lines):
    cleaned = line.strip()  # removes \n and spaces
    print(f"{i + 1}: {cleaned}")

Output:

1: Yes
2: No
3: Yes
4: Maybe
5: No

!! 4. File output: Writing processed data

Say you want to save only the “yes” responses to a new file:

with open("responses.txt", "r") as infile:
    lines = infile.readlines()

yes_responses = []

for line in lines:
    if line.strip().lower() == "yes":
        yes_responses.append(line.strip())

with open("yes_only.txt", "w") as outfile:
    for item in yes_responses:
        outfile.write(item + "\n")

This filter is a very simple version of a read-transform-write pipeline, a concept used daily in data preprocessing.

!! ⏭ Exercise: Write your first data script

Create a file called survey.txt and copy in the following lines:

Now write a Python script that:

  1. Reads the file
  2. Counts how often “yes” appears (case-insensitive); you will learn more about working with strings later in the course, but give it a go!
  3. Prints the count
  4. Writes a cleaned version of the data (capitalized, no extra whitespace) to cleaned_survey.txt
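
If you get stuck, here is one possible solution sketch, assuming survey.txt contains one response per line:

with open("survey.txt") as infile:
    lines = [line.strip() for line in infile if line.strip()]

yes_count = sum(1 for line in lines if line.lower() == "yes")
print(f"'Yes' count: {yes_count}")

with open("cleaned_survey.txt", "w") as outfile:
    for line in lines:
        outfile.write(line.capitalize() + "\n")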

. Day 2: Basic Data Structures

Data science is about organizing and shaping data so it can be cleaned, analyzed, or modeled. Today you will learn four essential data structures in core Python and how to use them for real data tasks:

  • List: for ordered sequences of values
  • Tuple: for fixed-position records
  • Dict: for labeled data (like columns)
  • Set: for tracking unique values

!! 1. Lists: for ordered sequences of data

Lists are the most flexible and common structure, suitable for representing:

  • A column of values
  • A collection of records
  • A dataset of unknown size

For example: read the values from a file into a list.

with open("scores.txt", "r") as file:
    scores = [float(line.strip()) for line in file]

print(scores)

This prints:

Now you can:

average = sum(scores) / len(scores)
print(f"Average score: {average:.2f}")

Output:

!! 2. Tuples: for fixed-structure records

Tuples are like lists, but immutable, which makes them a great fit for rows with a fixed structure, such as (name, age).
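
A quick illustration: you can read tuple fields by position, but reassigning a field raises an error.

person = ("Alice", 34)
print(person[0])     # positional access works: Alice
# person[0] = "Bob"  # would raise TypeError: tuples do not support item assignment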

Example: read names and ages from a file.
Suppose we have the following people.txt:

Alice, 34
Bob, 29
Eve, 41

Now read in the contents of the file:

with open("people.txt", "r") as file:
    records = []
    for line in file:
        name, age = line.strip().split(",")
        records.append((name.strip(), int(age.strip())))

Now you can access fields by position:

for person in records:
    name, age = person
    if age > 30:
        print(f"{name} is over 30.")

!! 3. Dicts: for labeled data (like columns)

Dictionaries store key-value pairs and are the closest thing in core Python to a table row with named columns.

For example: turn each person’s record into a dict:

people = []

with open("people.txt", "r") as file:
    for line in file:
        name, age = line.strip().split(",")
        person = {
            "name": name.strip(),
            "age": int(age.strip())
        }
        people.append(person)

Now your data is much more readable and flexible:

for person in people:
    if person("age") < 60:
        print(f"{person('name')} is perhaps a working professional.")

!! 4. Sets: for uniqueness and fast membership checks

Sets automatically remove duplicates, which makes them great for:

  • Counting unique values
  • Checking whether a value has been seen before
  • Detecting distinct values regardless of order

Example: find all unique domains in a file of email addresses.

domains = set()

with open("emails.txt", "r") as file:
    for line in file:
        email = line.strip().lower()
        if "@" in email:
            domain = email.split("@")(1)
            domains.add(domain)

print(domains) 

Output:

{'gmail.com', 'yahoo.com', 'example.org'}
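
Sets also make membership checks fast, which is handy for filtering. Continuing with the domains set from above:

# Membership tests on a set are O(1) on average
if "gmail.com" in domains:
    print("At least one Gmail address was found.")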

!! ⏭ Exercise: Code a mini data inspector

Create a file called dataset.txt with the following content:

Now write a Python script that:

  1. Reads every line and stores it as a dictionary with the keys: name, age, role
  2. Counts how many people are in each role (use a dictionary) and the number of unique ages (use a set)

. Day 3: Working with strings

Messy text is everywhere in real-world datasets.

Today, you will learn:

  • Clean and standardize raw text
  • Extract information from strings
  • Build simple text-based features (the kind you can use for filtering or modeling)

!! 1. Basic string cleaning

Say you get this raw list of job titles from a CSV:

titles = [
    "  Data Scientist\n",
    "data scientist",
    "Senior Data Scientist ",
    "DATA scientist",
    "Data engineer",
    "Data Scientist"
]

Your job? Normalize them.

cleaned = [title.strip().lower() for title in titles]

Now everything is lowercase and free of extra whitespace.

Output:

['data scientist', 'data scientist', 'senior data scientist', 'data scientist', 'data engineer', 'data scientist']

!! 2. Standardizing values

Say you are only interested in identifying data scientists.

standardized = []

for title in cleaned:
    if "data scientist" in title:
        standardized.append("data scientist")
    else:
        standardized.append(title)

!! 3. Counting words, checking patterns

Useful text features:

  • The number of words
  • Whether a string contains a keyword
  • Whether a string looks like a number or an email

Example:

text = " The price is $5,000!  "

# Clean up
clean = text.strip().lower().replace("$", "").replace(",", "").replace("!", "")
print(clean)  

# Word count
word_count = len(clean.split())

# Contains digit
has_number = any(char.isdigit() for char in clean)

print(word_count)
print(has_number)

Output:

"the price is 5000"
4
True

!! 4. Splitting and extracting parts

Let’s take an email example:

email = "  Alice.Johnson@Example.com  "
email = email.strip().lower()

username, domain = email.split("@")

print(f"User: {username}, Domain: {domain}")

This prints:

User: alice.johnson, Domain: example.com

This type of extraction is used in user behavior analysis, spam detection, and so on.

!! 5. Detecting specific patterns in text

You do not need regular expressions for simple pattern checks.

For example: check whether someone mentioned “Python” in a free-text response:

comment = "I'm learning Python and SQL for data jobs."

if "python" in comment.lower():
    print("Mentioned Python")

!! ⏭ Exercise: Clean survey comments

Create a file called comments.txt with the following lines:

Great course! Loved the pacing.
Not enough Python examples.
Too basic for experienced users.
python is exactly what I needed!
Would like more SQL content.
Excellent – very beginner-friendly.

Now write a Python script that:

  1. Cleans each comment (strip, lowercase, remove punctuation)
  2. Prints the total number of comments, how many mention “Python”, and the average word count per comment

. Day 4: Grouping, Counting, and Summarizing with Dictionaries

You have used dicts to store labeled records. Today, you will go one level deeper: using dictionaries to group, count, and summarize data – like a pivot table or GROUP BY in SQL.

!! 1. Grouping by a field

Say you have this data:

data = [
    {"name": "Alice", "city": "London"},
    {"name": "Bob", "city": "Paris"},
    {"name": "Eve", "city": "London"},
    {"name": "John", "city": "New York"},
    {"name": "Dana", "city": "Paris"},
]

Goal: count how many people live in each city.

city_counts = {}

for person in data:
    city = person["city"]
    if city not in city_counts:
        city_counts[city] = 1
    else:
        city_counts[city] += 1

print(city_counts)

Output:

{'London': 2, 'Paris': 2, 'New York': 1}
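
As an aside, once you are comfortable with the manual pattern, the standard library’s collections.Counter does the same counting in one line; a minimal equivalent:

from collections import Counter

city_counts = Counter(person["city"] for person in data)
print(city_counts)  # Counter({'London': 2, 'Paris': 2, 'New York': 1})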

!! 2. Summarizing a field by category

Now say we have:

salaries = [
    {"role": "Engineer", "salary": 75000},
    {"role": "Analyst", "salary": 62000},
    {"role": "Engineer", "salary": 80000},
    {"role": "Manager", "salary": 95000},
    {"role": "Analyst", "salary": 64000},
]

Goal: calculate the total and average salary per role.

totals = {}
counts = {}

for person in salaries:
    role = person["role"]
    salary = person["salary"]

    totals[role] = totals.get(role, 0) + salary
    counts[role] = counts.get(role, 0) + 1

averages = {role: totals[role] / counts[role] for role in totals}

print(averages)

Output:

{'Engineer': 77500.0, 'Analyst': 63000.0, 'Manager': 95000.0}

!! 3. Frequency tables (finding the mode)

Find the most common age in a dataset:

ages = [29, 34, 29, 41, 34, 29]

freq = {}

for age in ages:
    freq[age] = freq.get(age, 0) + 1

most_common = max(freq.items(), key=lambda x: x[1])

print(f"Most common age: {most_common(0)} (appears {most_common(1)} times)")

Output:

Most common age: 29 (appears 3 times)

!! ⏭ Exercise: Analyze an employee dataset

Create a file employees.txt with the following content:

Alice,London,Engineer,75000
Bob,Paris,Analyst,62000
Eve,London,Engineer,80000
John,New York,Manager,95000
Dana,Paris,Analyst,64000

Write a Python script that:

  1. Loads the data into a list of dictionaries
  2. Prints the number of employees per city and the average salary per role

. Day 5: Writing Functions

You have written code that loads, cleans, filters, and summarizes data. Now you will package this logic into functions, so you can:

  • Reuse your code
  • Build processing pipelines
  • Keep your scripts readable and testable

!! 1. Cleaning text inputs

Let’s write a function for basic text cleaning:

def clean_text(text):
    return text.strip().lower().replace(",", "").replace("$", "")

Now you can apply it to every field you read from a file.
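
For example, applied to a messy price field:

print(clean_text("  The price is $5,000  "))  # the price is 5000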

!! 2. Parsing rows into records

Next, here is a simple function to parse each row of a file into a record:

def parse_row(line):
    parts = line.strip().split(",")
    return {
        "name": parts(0),
        "city": parts(1),
        "role": parts(2),
        "salary": int(parts(3))
    }

Now loading your file becomes:

with open("employees.txt") as file:
    rows = [parse_row(line) for line in file]

!! 3. Aggregation helpers

So far, you have computed averages and counts inline. Let’s write some basic helper functions for this:

def average(values):
    return sum(values) / len(values) if values else 0

def count_by_key(data, key):
    counts = {}
    for item in data:
        k = item[key]
        counts[k] = counts.get(k, 0) + 1
    return counts
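
Putting the helpers together (assuming the employees.txt file from Day 4):

with open("employees.txt") as file:
    rows = [parse_row(line) for line in file]

print(count_by_key(rows, "city"))  # {'London': 2, 'Paris': 2, 'New York': 1}

salaries = [row["salary"] for row in rows]
print(f"Average salary: {average(salaries):.2f}")  # Average salary: 75200.00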

!! ⏭ Exercise: Refactor your previous work

Refactor your Day 4 solution into reusable functions:

  • load_data(filename)
  • average_salary_by_role(data)
  • count_by_city(data)

Then use them in a script that prints the same output as Day 4.

. Day 6: File Reading, Writing, and Basic Error Handling

Data files are often incomplete, corrupted, or incorrectly formatted. So how do you deal with them?

Today you will learn:

  • How to read and write structured files
  • How to handle errors gracefully
  • How to skip or log bad rows without crashing

!! 1. Reading files safely

What happens when you try to read a file that does not exist? You should open the file inside a try block and catch the FileNotFoundError:

try:
    with open("employees.txt") as file:
        lines = file.readlines()
except FileNotFoundError:
    print("Error: File not found.")
    lines = []

!! 2. Handling bad rows gracefully

Now let’s skip bad rows and only process the complete ones.

records = []

for line in lines:
    try:
        parts = line.strip().split(",")
        if len(parts) != 4:
            raise ValueError("Incorrect number of fields")
        record = {
            "name": parts(0),
            "city": parts(1),
            "role": parts(2),
            "salary": int(parts(3))
        }
        records.append(record)
    except Exception as e:
        print(f"Skipping bad line: {line.strip()} ({e})")

!! 3. Writing clean data to a file

Finally, let’s write the cleaned data to a file.

with open("cleaned_employees.txt", "w") as out:
    for r in records:
        out.write(f"{r('name')},{r('city')},{r('role')},{r('salary')}\n")

!! ⏭ Exercise: Build an error-tolerant loader

Create a file raw_employees.txt with some incomplete or dirty lines, such as:

Alice,London,Engineer,75000
Bob,Paris,Analyst
Eve,London,Engineer,eighty thousand
John,New York,Manager,95000

Write a script that:

  1. Loads only the valid records
  2. Prints the number of valid rows
  3. Writes them to validated_employees.txt

. Day 7: Create a Mini Data Profiler (Project Day)

Great job making it this far! Today, you will create a standalone Python script that:

  • Loads a CSV file
  • Detects column names and types
  • Computes useful stats
  • Writes a summary report

!! Step by step

1. Load the file:

def load_csv(filename):
    with open(filename) as f:
        lines = [line.strip() for line in f if line.strip()]
    header = lines[0].split(",")
    rows = [line.split(",") for line in lines[1:]]
    return header, rows

2. Detect column types:

def detect_type(value):
    try:
        float(value)
        return "numeric"
    except ValueError:
        return "text"

3. Profile each column:

def profile_columns(header, rows):
    summary = {}
    for i, col in enumerate(header):
        values = [row[i].strip() for row in rows if len(row) == len(header)]
        col_type = detect_type(values[0])
        unique = set(values)
        summary[col] = {
            "type": col_type,
            "unique_count": len(unique),
            "most_common": max(set(values), key=values.count)
        }
        if col_type == "numeric":
            nums = [float(v) for v in values if v.replace('.', '', 1).isdigit()]
            summary[col]["average"] = sum(nums) / len(nums) if nums else 0
    return summary

4. Write the summary report:

def write_summary(summary, out_file):
    with open(out_file, "w") as f:
        for col, stats in summary.items():
            f.write(f"Column: {col}\n")
            for k, v in stats.items():
                f.write(f"  {k}: {v}\n")
            f.write("\n")

You can use these functions like so:

header, rows = load_csv("employees.csv")
summary = profile_columns(header, rows)
write_summary(summary, "profile_report.txt")
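
One caveat: this hand-rolled loader assumes fields never contain commas. For real-world CSVs with quoted fields, the standard library’s csv module is the safer choice; a minimal drop-in replacement:

import csv

def load_csv(filename):
    # csv.reader handles quoted fields, embedded commas, and so on
    with open(filename, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    return rows[0], rows[1:]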

!! ⏭ Final exercise

Use your own CSV file (or reuse an earlier one). Run the profiler and inspect the report.

. Conclusion

Congratulations! You have completed the Python for data science mini course. 🎉

Over this week, you have moved from basic data structures to writing modular functions and scripts that handle real data issues. These are the basics – and I do mean really basic things – so I suggest you use this as a starting point and explore the standard library further (of course).

Thank you for learning with me. Happy coding and data wrangling!

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she is learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, and more. She also creates engaging resource overviews and coding tutorials.
