[Python] Introduction to Python Programming

This post will introduce you to Python fundamentals. These basics are essential before we move on to pandas, the library that makes working with datasets straightforward and powerful.

Why Python?

Python has become one of the most widely used languages for data analysis across disciplines. Unlike paid software like SPSS or Excel, Python is free, open-source, and supported by a large community. More importantly, it integrates well with AI tools like ChatGPT and Claude, which can help you write and debug code as you learn.

Your First Python Program: Hello World

Every programming journey starts with “Hello World.” This tradition dates back to the 1970s, and it’s basically a programmer’s way of saying “I exist!” to the computer 🙂

Open a Python environment (IDLE, Jupyter Notebook, or Google Colab) and type:

Python
print("Hello World")

When you run this, you’ll see:

Hello World

Congratulations! You’ve just joined millions of programmers who started exactly this way. The print() function displays text or data on your screen. This simple command is more useful than it seems—you’ll use it constantly to check what your code is doing, kind of like leaving yourself breadcrumbs to follow.

Variables: Storing Information

Variables let you store data for later use. Think of them as labeled boxes that hold information, or better yet, like sticky notes where you write something down and slap a label on it so you can find it later. In most programming languages, a single equals sign (=) means you are assigning a value to a variable.

Python
name = "Maria"  # Assign the text "Maria" to the variable name
age = 28  # Assign the number 28 to the variable age
is_student = True  # Assign the value True to the variable is_student

Here, name stores text, age stores a number, and is_student stores a true/false value. The = sign doesn’t mean “equals” like in math—it means “assign this value to this variable.” Think of it as an arrow pointing left: “take what’s on the right and put it in the box on the left.”
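One way to see that = means assignment rather than equality is to update a variable using its own current value. The sketch below uses a hypothetical visits counter that is not part of the examples above:

```python
# "visits" is a hypothetical variable used only for illustration
visits = 3
visits = visits + 1   # Take the current value (3), add 1, store the result back
print(visits)         # Output: 4

# Comparing two values uses a double equals sign (==) instead
print(visits == 4)    # Output: True
```

In math, visits = visits + 1 would be a contradiction. In Python it is a perfectly normal instruction.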

You can use these variables throughout your code:

Python
print(name)  # Output: Maria
print(age)   # Output: 28

Variable names should be descriptive. Instead of x, use client_age or survey_response. Your future self (reading the code at 2am before a deadline) will thank you. It’s like writing clear case notes instead of cryptic abbreviations that nobody remembers a month later.

Data Types: The Building Blocks

Python has several basic data types. Understanding these is essential for working with data.

Strings (Text)

Strings represent text and are enclosed in quotes, either double ("") or single (''):

Python
first_name = "John"
last_name = 'Doe'

You can combine strings using + (concatenation):

Python
full_name = first_name + " " + last_name
print(full_name)  # Output: John Doe

You can also count the number of characters in a string and convert it to upper or lower case:

Python
message = "Hello, world!"
print(len(message))        # Length: 13
print(message.upper())     # HELLO, WORLD!
print(message.lower())     # hello, world!
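Another way to build strings is an f-string: put an f before the opening quote and place variable names inside curly braces. The loop examples later in this post use this style. A minimal sketch, with the variables defined again so the snippet runs by itself:

```python
first_name = "John"
last_name = "Doe"
age = 28

# Values inside {} are inserted into the text
greeting = f"{first_name} {last_name} is {age} years old"
print(greeting)  # Output: John Doe is 28 years old
```

Notice that age is a number here, and the f-string converts it to text for you.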

Numbers

Python has two main numeric types:

Python
# Integers (whole numbers)
num_clients = 45
year = 2024

# Floats (decimal numbers)
average_score = 3.7
percentage = 85.5

You can perform mathematical operations, using Python much like a calculator:

Python
total = 100
completed = 67
completion_rate = (completed / total) * 100
print(completion_rate)  # Output: 67.0
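Beyond the basic operators, a few others come up often. The sketch below uses made-up numbers to show floor division, remainder, exponent, and rounding:

```python
sessions = 17     # Hypothetical total number of sessions
group_size = 5    # Hypothetical number of sessions per group

print(sessions // group_size)            # Floor division: 3
print(sessions % group_size)             # Remainder: 2
print(group_size ** 2)                   # Exponent: 25
print(round(sessions / group_size, 1))   # Rounded to 1 decimal: 3.4
```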

Booleans (True/False)

Booleans represent binary states (yes/no):

Python
is_active = True
has_insurance = False
meets_criteria = age >= 18 and is_active

These are particularly useful for filtering data and conditional logic.
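As a small sketch of filtering, the comparison below is evaluated for each value in a list, and only the values where it is True are kept. The ages are invented for illustration, and the list and loop syntax are covered in the sections that follow.

```python
ages = [15, 22, 17, 34]   # Hypothetical sample data

adults = []
for age in ages:
    is_adult = age >= 18   # Each comparison produces True or False
    if is_adult:
        adults.append(age)

print(adults)  # Output: [22, 34]
```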

Collections: Working with Multiple Values

Real-world data rarely comes as single values. Python provides several ways to store collections of data.

Lists

Lists are ordered collections that can hold multiple items:

Python
client_ages = [25, 34, 42, 28, 51]
service_types = ["counseling", "housing", "employment", "healthcare"]

You can access items by their position (starting from 0):

Python
print(client_ages[0])      # First item: 25
print(service_types[2])    # Third item: employment

🤔 Wait, why does the first item have index 0? Welcome to one of programming’s quirks: Python, like most programming languages, counts positions from 0, not 1. Think of it like floor numbers in some countries, where the ground floor is 0 and the first floor up is 1.
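A practical consequence of counting from 0 is that the last item sits at position len(list) - 1. Python also accepts negative positions that count from the end. A short sketch, reusing the client_ages values from above:

```python
client_ages = [25, 34, 42, 28, 51]

print(len(client_ages))                   # Number of items: 5
print(client_ages[len(client_ages) - 1])  # Last item: 51
print(client_ages[-1])                    # Same item, counted from the end: 51
print(client_ages[1:3])                   # Positions 1 and 2: [34, 42]
```

Asking for client_ages[5] here would raise an IndexError, because the valid positions are 0 through 4.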

Lists are mutable, meaning you can change them after creating them:

Python
client_ages.append(39)     # Add a new age
print(client_ages)         # [25, 34, 42, 28, 51, 39]

service_types[1] = "food assistance"
print(service_types)       # ['counseling', 'food assistance', 'employment', 'healthcare']

Think of a list like a client intake form where you can add new information or update existing entries.

Dictionaries

Dictionaries store data as key-value pairs. The name comes from actual dictionaries: you look up a word (the key) to find its definition (the value).

Python
client = {
    "id": 1001,
    "name": "Sarah Johnson",
    "age": 32,
    "services": ["counseling", "housing"]
}

print(client["name"])      # Sarah Johnson
print(client["age"])       # 32

Think of a dictionary like a client file folder where each piece of information has a clear label. You don’t have to remember “the client’s name is the third item”—you just ask for client["name"] and you get it. This is much more intuitive than remembering positions like you do with lists.

Dictionaries are incredibly useful for representing structured data, like client records, survey responses, or program information. In fact, most real-world data you’ll work with (JSON from APIs, database records) uses this key-value structure.
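Like lists, dictionaries can be changed after you create them. The sketch below adds and updates entries on a smaller version of the client record, and uses get() to look up a key that may be missing. The "city" value and the "phone" key are hypothetical and only there for illustration.

```python
client = {"id": 1001, "name": "Sarah Johnson", "age": 32}

client["age"] = 33                 # Update an existing value
client["city"] = "Chicago"         # Add a new key-value pair (made-up value)

print(client["age"])               # 33
print(client.get("phone"))         # None, because "phone" was never stored
print(client.get("phone", "n/a"))  # n/a, a fallback you choose

# Loop over every key-value pair
for key, value in client.items():
    print(key, value)
```

Using client["phone"] directly would raise a KeyError, so get() is a gentler option when you are not sure a key exists.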

Basic Operations and Control Flow

Conditional Statements

Conditional statements let your code make decisions:

Python
age = 17

if age >= 18:
    print("Adult services available")
else:
    print("Youth services available")

You can check multiple conditions:

Python
income = 25000

if income < 20000:
    eligibility = "Full subsidy"
elif income < 40000:
    eligibility = "Partial subsidy"
else:
    eligibility = "No subsidy"

print(eligibility)

Loops

Loops let you repeat operations, which is essential when working with data. Imagine you have 500 client records and need to calculate something for each one. You could copy-paste your code 500 times, or you could write a loop. (Spoiler: choose the loop. Your keyboard will thank you.)

For loops iterate over collections:

Python
ages = [23, 45, 34, 56, 28]

for age in ages:
    print(f"Client age: {age}")

This is like saying “for each client in my caseload, do this thing.” The computer does it instantly instead of you manually doing it one by one.

You can calculate statistics:

Python
ages = [23, 45, 34, 56, 28]
total = 0

for age in ages:
    total = total + age

average = total / len(ages)
print(f"Average age: {average}")  # Average age: 37.2

Without a loop, you’d have to write total = 23 + 45 + 34 + 56 + 28. Now imagine doing that for 500 numbers. Loops are basically automation—the computer does the tedious repetitive stuff so you can focus on analysis and interpretation.
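Writing the loop by hand shows what is happening, but for common calculations Python's built-in functions do the same work. A sketch with the same ages list:

```python
ages = [23, 45, 34, 56, 28]

average = sum(ages) / len(ages)   # sum() adds up every item in the list
print(average)                    # 37.2
print(min(ages), max(ages))       # 23 56
```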

While loops repeat as long as a condition stays true, and stop as soon as it becomes false:

Python
count = 0
while count < 5:
    print(f"Count: {count}")
    count = count + 1

While loops are useful when you don’t know in advance how many times you need to repeat something—like “keep asking for input until the user enters a valid response.”
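The sketch below imitates that "keep asking" pattern. To keep it runnable without typing anything, it reads from a hypothetical list of responses instead of calling input(), and the valid range of 1 to 5 is also an assumption.

```python
# Hypothetical answers a user might give, in order
responses = ["abc", "9", "4"]

attempts = 0
rating = None
while rating is None:
    answer = responses[attempts]    # In a real script this could be input()
    attempts = attempts + 1
    # Accept only whole numbers from 1 to 5
    if answer.isdigit() and 1 <= int(answer) <= 5:
        rating = int(answer)

print(f"Valid rating {rating} after {attempts} attempts")
# Output: Valid rating 4 after 3 attempts
```

The loop cannot know ahead of time that the third answer will be the valid one, which is exactly the situation where while fits better than for.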

Functions: Reusable Code

Functions let you package code for reuse. Instead of writing the same code repeatedly, you define a function once and call it whenever needed:

Python
def calculate_risk_score(has_housing, has_income, has_support):
    score = 0
    if not has_housing:
        score += 3
    if not has_income:
        score += 2
    if not has_support:
        score += 2
    return score

# Use the function
client_risk = calculate_risk_score(False, True, False)
print(f"Risk score: {client_risk}")  # Risk score: 5

Functions make your code organized and easier to maintain.
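To show the reuse, the sketch below calls the same function for several clients in a loop. The function is repeated so the snippet runs alone, and the three client records are invented.

```python
def calculate_risk_score(has_housing, has_income, has_support):
    score = 0
    if not has_housing:
        score += 3
    if not has_income:
        score += 2
    if not has_support:
        score += 2
    return score

# Hypothetical client records
clients = [
    {"name": "Alice", "housing": True, "income": True, "support": True},
    {"name": "Bob", "housing": False, "income": False, "support": False},
    {"name": "Carol", "housing": True, "income": False, "support": True},
]

scores = []
for client in clients:
    score = calculate_risk_score(
        client["housing"], client["income"], client["support"]
    )
    scores.append(score)
    print(f"{client['name']}: {score}")

# Output:
# Alice: 0
# Bob: 7
# Carol: 2
```

If the scoring rules change, you edit the function once and every client is scored with the new rules.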

Python Quirks and Conventions

Before we move on to a complete example, let’s talk about some Python-specific quirks that aren’t really “concepts” but will absolutely trip you up if you don’t know about them.

Indentation: Whitespace Actually Matters

In Python, indentation is part of the syntax.

Python
# This works:
if age >= 18:
    print("Adult")
    print("Can vote")

# This causes an error:
if age >= 18:
print("Adult")  # IndentationError!

Python uses indentation to define code blocks. Most people use 4 spaces per indentation level.

The Great Tab vs Space Debate: Some people indent with tabs, some with spaces. Python accepts either, but if you mix them inconsistently, Python 3 raises a TabError. The safe rule is to pick one and never mix them in the same file. Most style guides, including PEP 8, recommend 4 spaces, and most modern editors can convert tabs to spaces automatically. This debate is iconic among programmers 🙂

Case Sensitivity: Capitalization Matters

Python treats Name, name, and NAME as three completely different variables.

Python
client_name = "Maria"
Client_name = "John"
CLIENT_NAME = "Sarah"

print(client_name)    # Maria
print(Client_name)    # John
print(CLIENT_NAME)    # Sarah

This seems obvious but will bite you when you’re debugging at midnight wondering why Age isn’t working when you defined age.
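Here is roughly what that midnight bug looks like. The try/except wrapper is not covered in this post. It is used here only so the snippet can show the error without stopping.

```python
age = 28

try:
    print(Age)   # Capital A: Python looks for a different variable
except NameError as error:
    message = str(error)
    print(message)   # name 'Age' is not defined
```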

Naming Conventions: snake_case Rules

Python has a preferred naming style called “snake_case” for variables and functions—all lowercase with underscores between words:

Python
# Good Python style
client_age = 28
total_income = 45000
calculate_risk_score()

# Bad style (works, but not Pythonic)
ClientAge = 28
totalIncome = 45000
CalculateRiskScore()

The latter styles (PascalCase and camelCase) are common in other languages like Java or JavaScript. In Python, PascalCase is reserved for class names, which you’ll learn later, and camelCase is generally avoided. Following these conventions makes your code look “Pythonic” and helps other Python programmers read your code.

Comments: Talking to Your Future Self

Comments are notes in your code that Python ignores. They start with #:

Python
# This is a comment
age = 28  # You can also put comments at the end of lines

# Comments are for explaining WHY, not WHAT
# Bad comment:
total = total + 1  # Add 1 to total

# Good comment:
total = total + 1  # Increment counter for each completed session

Write comments like you’re leaving notes for someone who will read your code in 6 months—because that someone is usually you, and you will have forgotten everything. Comments are also useful for temporarily “turning off” code without deleting it:

Python
# print("Debug message")  # Commented out for now

Reading Error Messages: Don’t Panic

Error messages look scary but they’re actually trying to help. When you get an error, read from bottom to top:

Python
Traceback (most recent call last):
  File "script.py", line 10, in <module>
    result = calculate_score(age, income)
  File "script.py", line 5, in calculate_score
    return score / count
ZeroDivisionError: division by zero

The last line tells you WHAT went wrong (ZeroDivisionError). The lines above tell you WHERE (line 5, in the calculate_score function). The error message is like a breadcrumb trail showing you exactly where to look.

Common errors you’ll see:

  • SyntaxError: You wrote something Python can’t parse (missing colon, unclosed quote or parenthesis)
  • NameError: You tried to use a variable that doesn’t exist (typo?)
  • TypeError: You tried to do something with the wrong data type (like adding a number to a string)
  • IndentationError: Your spacing is inconsistent
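The sketch below triggers two of these errors on purpose and records the name of each one. The try/except syntax goes beyond this post and is used only so the script keeps running. The variable name missing_value is hypothetical.

```python
caught = []

try:
    label = "Client " + 123       # Adding a number to a string
except TypeError as error:
    caught.append(type(error).__name__)

try:
    print(missing_value)          # This variable was never defined
except NameError as error:
    caught.append(type(error).__name__)

print(caught)  # ['TypeError', 'NameError']
```

SyntaxError and IndentationError are left out because Python reports them before any line of the script runs, so they cannot be caught this way from inside the same file.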

Python Is Forgiving… Until It Isn’t

Python is flexible in many ways, but it will not guess how to combine different data types. You have to convert values yourself:

Python
# This works because str() explicitly converts the number to text:
result = "Client " + str(123)  # "Client 123"

# But this will error:
result = "Client " + 123  # TypeError: can only concatenate str (not "int") to str

When something isn’t working, check your data types. Use type() to see what you’re actually working with:

Python
age = "28"  # Wait, is this a string or a number?
print(type(age))  # <class 'str'> - it's a string!
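Once type() shows that a value is a string, you can convert it with int() or float() before doing arithmetic. A minimal sketch:

```python
age = "28"               # Text, for example read from a file or a form
age_number = int(age)    # Convert text to a whole number

print(type(age_number))  # <class 'int'>
print(age_number + 1)    # 29

score = float("3.7")     # Convert text to a decimal number
print(score * 2)         # 7.4
```

If the text does not look like a number, such as int("abc"), Python raises a ValueError instead of guessing.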

A Simple Example

Here’s a practical example that combines what we’ve covered:

Python
# Client data
clients = [
    {"name": "Alice", "age": 28, "income": 18000},
    {"name": "Bob", "age": 45, "income": 32000},
    {"name": "Carol", "age": 34, "income": 25000}
]

# Function to determine eligibility
def check_eligibility(income):
    if income < 20000:
        return "Eligible for full assistance"
    elif income < 30000:
        return "Eligible for partial assistance"
    else:
        return "Not eligible"

# Process each client
for client in clients:
    status = check_eligibility(client["income"])
    print(f"{client['name']}, age {client['age']}: {status}")

Output:

Alice, age 28: Eligible for full assistance
Bob, age 45: Not eligible
Carol, age 34: Eligible for partial assistance
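Building on this example, a dictionary can tally how many clients fall into each category. This is a sketch that repeats the data and function so it runs alone. The counts dictionary is an addition for illustration.

```python
clients = [
    {"name": "Alice", "age": 28, "income": 18000},
    {"name": "Bob", "age": 45, "income": 32000},
    {"name": "Carol", "age": 34, "income": 25000}
]

def check_eligibility(income):
    if income < 20000:
        return "Eligible for full assistance"
    elif income < 30000:
        return "Eligible for partial assistance"
    else:
        return "Not eligible"

# Count how many clients receive each status
counts = {}
for client in clients:
    status = check_eligibility(client["income"])
    # get() returns the current count, or 0 if this status is new
    counts[status] = counts.get(status, 0) + 1

for status, number in counts.items():
    print(f"{status}: {number}")

# Output:
# Eligible for full assistance: 1
# Not eligible: 1
# Eligible for partial assistance: 1
```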

Why This Matters: Moving Toward pandas

The structures we’ve covered—lists, dictionaries, loops—are powerful, but they become cumbersome with large datasets. Imagine you have survey data from 500 respondents with 50 questions each. Managing this with basic Python structures would be tedious and error-prone.

This is where pandas comes in. pandas provides a DataFrame structure that handles tabular data efficiently—think of it as a supercharged spreadsheet that you can manipulate with code. Everything you’ve learned here (data types, loops, conditionals, functions) applies to pandas, but pandas adds specialized tools for data analysis.

In the next post, we’ll introduce pandas and show how it transforms the way you work with datasets. You’ll see how operations that would take dozens of lines of basic Python code can be done in one or two lines with pandas.

Resources

  • Official Python Tutorial: https://docs.python.org/3/tutorial/
  • Python for Everybody (free course): https://www.py4e.com/
  • Real Python (tutorials): https://realpython.com/


  • December 22, 2025