Teaching How to Use Python in Excel Part 2: Lists, Loops, and Conditions

12 Sep 23

Python Lists

A List in Python is a data type that can store multiple pieces of data, and each piece of data can be of any type. Lists can contain duplicates, and the order of the data items is significant, allowing us to reference their positions.

Suppose we have an original List like this:

colorList=["red","green","blue","yellow"]

Accessing Items in a List:

To access items in a List, we can use the following syntax:

ListName[index]

We can reference index numbers in two ways:

Positive index: It starts from the first item with index 0, the second item has an index of 1, and so on.
Negative index: It counts backward, with the last item having an index of -1, the second-to-last item at -2, and so forth.

For example:

colorList[1] retrieve the second item, which is the string “green.”
colorList[-3] refers to the item at the third position from the end, which is also “green.”

Slicing to Select a Range of Items in a List

We can slice and select a portion of a List using the following method:

ListName[start index : end index (not included)]
ListName[ indexเริ่ม : แต่ไม่ถึง indexจบ ]

For example:

colorList[1:3]
# Selects from index 1 (the second item) up to, but not including, index 3 (up to the third item).

In summary, [1:3] means selecting items from the second to the third item.

If you want to start from the beginning, you can use colorList[0:3], or you can omit the start index altogether, like colorList[:3]

colorList[:3]
# Selects from the beginning up to, but not including, index 3 (up to the third item).

In summary, [:3] means selecting items from the first to the third item.

To select items up to the last one, you can omit the end index:

colorList[1:]
# เอาตั้งแต่ index1 (เริ่มตัวที่2) จนถึงตัวสุดท้าย

In summary, [1:] means selecting items from the second to the last one.

You can also use negative indices as before:

colorList[1:-1]
# Selects from index 1 (the second item) to the end.

In summary, [1:-1] means selecting items from the second to the second-to-last item.

Combining Lists Together

We can merge items from two Lists together using the + operator, making it straightforward. For example:

MyList1=["red","green","blue"]
MyList2=[55,20,200]
combinedList=MyList1+MyList2

However, if you want to merge them as pairs, you can use zip:

MyList1=["red","green","blue"]
MyList2=[55,20,200]
zipObject=zip(MyList1,MyList2)
zipList=list (zipObject)
print(zipList)

Starting a Loop:

Once we are familiar with Lists as objects capable of storing multiple items, let’s take a look at how to use loops. The structure of a for loop in Python is as follows:

for item in Object with a Sequence: # Must include a colon (:) at the end 
      # Commands to be repeated in the loop
      # Indented with a tab to indicate they are part of the loop 
# Commands not related to the loop

Looping Through a List

Suppose you want to loop through the colorList you’ve created and print the color names in uppercase. Below is a detailed explanation of how to achieve this:

You can name the loop variable (the one that comes after “for”) anything you like, such as i, x, num, char, color, blahblah, or anything else. In this case, we’ve named it “color.” During each iteration of the loop, the “color” variable will change to one of the strings “red,” “green,” “blue,” or “yellow.”

To convert the color names to uppercase, you can use the upper() method. This method is embedded in string objects and is used to convert text to uppercase.

Here’s the code to loop through the list, convert each color to uppercase, and print them:

for color in colorList:
    UpColor=color.upper()    # Create a variable UpColor to store the uppercase version
    print(UpColor)            # Print UpColor in the console
print("This is outside the loop")

The result will be that each color name is printed in uppercase in the console. In Excel, it will be displayed in the Diagnostics section on the right (you can drag it out). The loop will print each color once because there are four colors in the list, and it iterates through each one. The line “print(“This is outside the loop”)” is only printed once because it’s not indented and is outside the loop.

In practice, you can simplify the loop like this:

for color in colorList:
    print(color.upper())

If you want the output to be a dynamic array in a cell, you need to ensure that the result is in a list because Excel won’t display NoneType.

One way to do this is to create an empty list and append each item to it during the loop, as shown below. Then you can call this list later:

UpperColorList=[]                      # Create an empty UpperColorList
for color in colorList:
    UpperColorList.append(color.upper()) # Append each item to the list
UpperColorList                         # Display UpperColorList

List Comprehension

Nevertheless, we can create a new List more easily than the approach described above, using a technique called List Comprehension. It’s like writing the shortest possible for loop that results in a new List, all without having to write lengthy code. For example, in the format:

[result for item in SequenceObject]

For instance:

[color.upper() for color in colorList]

Now, let’s look at looping through items that aren’t Lists directly.

Looping Through a Range

If you want to loop through a sequence of six numbers starting from 0, you can use the range(6) function. It’s an object similar to a List that contains the numbers 0, 1, 2, 3, 4, and 5.

To print numbers from 0 to 5 in the console, using the variable i to represent each number during each iteration of the loop:

for i in range(6):
    print(i)

However, if you want it to be a dynamic array in a cell, you can use List Comprehension:

[i for i in range(6)]

This immediately gives you a new List, making it simple.

If you want numbers that don’t start from 0, you can do this using:

range(start number, up to but not including end number)

เช่น

[i for i in range(3,10)]

This will give you numbers from 3 to 9 (not including 10).

You can also specify a step greater than 1 using:

range(start number, up to but not including end number, step)

For example, to increment by 2.

[i for i in range(3,9,2)]

Returning to range(6), which represents numbers from 0 to 5:

If you don’t need a new List as the result, you might not need to use List Comprehension. For instance, to calculate the cumulative sum of numbers from 0 to 5:

x=0
for i in range(6):
    x=x+i
x

Or, if you want to use List Comprehension to create a List of results first (stored in the variable MyList) and then sum all the values in the List:

MyList=[i for i in range(6)]
x = sum(MyList)

Looping Through a String

In addition to looping through Lists and Ranges, we can also loop through Strings or text. This involves iterating through each character of the text, which can be quite powerful.

Here’s an example:

MyList=[]                # Create an empty List
for char in "thepexcel":
    MyList.append(char)    # Add each character to the List
MyList                    # Display MyList

However, using List Comprehension makes this much simpler:

MyList=[char for char in "thepexcel"]

Both of these methods achieve the same result, where each character of the string “thepexcel” is added to the List “MyList”

Using Conditions

We can use “if” to create certain conditions.

In the case where a condition is true, we can specify what to do (and do nothing if it’s not true).

Here’s the basic structure:

if condition:
    #action1 if true
    #action2 if true

For example, I’ll show you two cases: one where the condition is true, and another where it’s not. In the case where the condition is true, the specified actions will be executed. If the condition is not met, nothing will happen.

result1="original1"
score=45
if score>10:
    result1="score1 is greater than 10"

result2="original2"
score=5
if score>10:
    result2="score2 is greater than 10"
    
[result1,result2]

When writing if statements like this, if there’s only one action to perform, we can simplify it to avoid indentation:

if condition: action_if_true

For example, in this case where I’m showing you the true condition:

result1="original1"
score=45
if score>10:result1="score1 is greater than 10"
    
result1

In the case where the condition is true, perform certain actions, and if it’s not true, do something else.

In addition to using if, we can also use else to handle the case where the condition is not true. The structure looks like this:

if condition:
    #action if true
else:
    #action if false

เช่น

result="original"
score=5
if score>10:
    result="score is greater than 10"
else:
    result="score is not greater than 10"
result

If you have only one action for each condition, you can simplify it even further using the ternary conditional expression:

action_if_true if condition else action_if_false

However, note that in this form, you cannot use it to assign values to variables directly. So, I’ll use print to demonstrate the concept instead of assigning a value to score.

score=5
print("score is  greater than 10") if score>10 else print("score is not greater than 10")

Try Using Conditions in Loops

Let’s take a look at an example of using conditions in loops.

Suppose we want to iterate over a string to examine each character. If we find the character ‘e’, we want to convert it to uppercase using the .upper() method within the if condition.

MyList=[]                   # Create an empty list
for char in "thepexcel":
    if char =="e":
        char=char.upper()    # Convert to uppercase if it's 'e'
    MyList.append(char)       # Add each character to the list
MyList                       # Display MyList

However, in this case, using list comprehension in the form of simply an action for true conditions will not work as expected:

MyList=[char.upper() for char in "thepexcel" if char=="e" ]

This approach creates a new list by “selecting only items that meet the condition,” which is the character ‘e’ in this case. It will result in only the ‘E’ characters in the list (ignoring the ‘e’s).

If you want to keep all characters unchanged, you should specify both the true and false conditions using the if-else structure, which will require a slight rearrangement:

MyList=[char.upper() if char=="e" else char for char in "thepexcel" ]

This way, it will work as intended.

Applying it:

Loop to Modify DataFrame Column Names

Suppose you have a DataFrame and you want to retrieve the column names as a list. You can do it like this:

df = xl("IrisDataSet[#All]", headers=True)
dfColList=list(df.columns)

Now, you want to iterate through the column names and convert them to uppercase only if they contain an underscore (_):

NewdfColList=[ col.upper() if "_" in col else col for col in dfColList ]

You can assign the new column names back to the original DataFrame:

df.columns = NewdfColList
df

Actually, you can write everything together in a single cell of code, like this:

df=xl("IrisDataSet[#All]", headers=True)
dfColList=list(df.columns)
NewdfColList=[ col.upper() if "_" in col else col for col in dfColList ]
df.columns = NewdfColList
df

Select and display only specific columns of the DataFrame.

Now, once we have organized the list, we can use it to select the desired columns. For example, if I always need columns 1, 4, and 5, I can do it like this:

SelectCol=[NewdfColList[i-1] for i in [1,4,5]]

Then, I can take this list of column names and apply it to the DataFrame to select only the columns I want, like this:

df[SelectCol]

Reference: https://www.thepexcel.com/python-in-excel-ep01/