

Photo by Author | Canva
. Introduction
When you are new to Python, you usually reach for loops whenever you have to process a collection of data. Need to square a list of numbers? Loop through them. Need to filter or sum them? Loop again. This feels intuitive to us as humans because our brains think and work sequentially (one thing at a time).
But that does not mean computers have to work that way. They can take advantage of something called vectorized thinking. Basically, instead of looping through every element to perform an operation, you hand the entire list to the computer and say, “Hey, here’s the list. Perform all the operations at once.”
In this tutorial, I will give you a gentle introduction to how it works and why it matters, and we will also cover a few examples to see how beneficial it can be. So, let’s get started.
. What Is Vectorized Thinking and Why Does It Matter?
As discussed earlier, vectorized thinking means that instead of handling operations sequentially, we perform them collectively. This idea is inspired by matrix and vector operations in mathematics, and it makes your code much faster and more readable. Libraries like NumPy allow you to implement vectorized thinking in Python.
For example, if you have to multiply a list of numbers by 2, then instead of accessing each element and operating on it one by one, you multiply the entire list simultaneously. This has major benefits, such as reducing much of the interpreter's overhead. Every time you iterate through a Python loop, the interpreter has to do a lot of work, such as checking types, managing objects, and handling loop mechanics. With a vectorized approach, you reduce that work by processing in bulk. It is also much faster. We will see that later with an example of the performance impact. I have visualized what I just said in the form of an image so you can get an idea of what I am referring to.
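As a minimal sketch of the multiply-by-2 idea described above (the data and variable names are illustrative, not from the original), the two styles could look like this:

```python
import numpy as np

numbers = [1, 2, 3, 4, 5]  # illustrative data

# Loop-based: visit each element and multiply it individually
doubled_loop = []
for n in numbers:
    doubled_loop.append(n * 2)

# Vectorized: one expression applied to the whole array at once
doubled_vec = np.array(numbers) * 2

print(doubled_loop)  # [2, 4, 6, 8, 10]
print(doubled_vec)   # [ 2  4  6  8 10]
```

Note that multiplying a plain Python list by 2 repeats the list instead of doubling its elements, which is why the conversion to a NumPy array matters here.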
Now that you have an idea of what it is, let’s see how you can implement it and how it can be useful.
. A Simple Example: Temperature Conversion
Different countries use different temperature conventions. For example, if you are familiar with the Fahrenheit scale and the data is given in Celsius, here is how you can convert it using both approaches.
!! Loop-based approach
celsius_temps = [0, 10, 20, 30, 40, 50]
fahrenheit_temps = []
# Convert each temperature one at a time
for temp in celsius_temps:
    fahrenheit = (temp * 9/5) + 32
    fahrenheit_temps.append(fahrenheit)
print(fahrenheit_temps)
Output:
[32.0, 50.0, 68.0, 86.0, 104.0, 122.0]
!! Vectorized approach
import numpy as np
celsius_temps = np.array([0, 10, 20, 30, 40, 50])
# Apply the formula to the whole array in one expression
fahrenheit_temps = (celsius_temps * 9/5) + 32
print(fahrenheit_temps)
Output:
[ 32.  50.  68.  86. 104. 122.]
Instead of dealing with each item one at a time, we turn the list into a NumPy array and apply the formula to all elements at once. Both approaches process the data and give the same result. Apart from the NumPy code being more concise, you may not notice the time difference right now. But we will cover that shortly.
. Advanced Example: Math Operations on Multiple Arrays
Let’s take another example where we have multiple input arrays and we have to calculate profit. Here is how you can do it with both approaches.
!! Loop-based approach
revenues = [1000, 1500, 800, 2000, 1200]
costs = [600, 900, 500, 1100, 700]
tax_rates = [0.15, 0.18, 0.12, 0.20, 0.16]
profits = []
# Compute the net profit entry by entry, indexing into all three lists
for i in range(len(revenues)):
    gross_profit = revenues[i] - costs[i]
    net_profit = gross_profit * (1 - tax_rates[i])
    profits.append(net_profit)
print(profits)
Output:
[340.0, 492.00000000000006, 264.0, 720.0, 420.0]
Here, we are manually calculating the profit for every entry:
- Subtracting the cost from the revenue (gross profit)
- Applying the tax
- Appending the result to a new list
This works fine, but it involves a lot of manual indexing.
!! Vectorized approach
import numpy as np
revenues = np.array([1000, 1500, 800, 2000, 1200])
costs = np.array([600, 900, 500, 1100, 700])
tax_rates = np.array([0.15, 0.18, 0.12, 0.20, 0.16])
# Element-wise arithmetic across all three arrays, no indexing needed
gross_profits = revenues - costs
net_profits = gross_profits * (1 - tax_rates)
print(net_profits)
Output:
[340. 492. 264. 720. 420.]
The vectorized version is also more readable, and it performs element-wise operations across all three arrays simultaneously. Now, I don’t want to keep repeating “it’s faster” without solid proof. You might be wondering, “What is Kanwal even talking about?” So now that you have seen how to implement it, let’s look at the performance difference between the two.
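The loop output above contains 492.00000000000006 while the NumPy output prints 492., so it is worth checking that the two approaches really agree. The following is a small sketch, assuming the same input values as the example; the names `loop_profits` and `vec_profits` are illustrative, and `zip()` is used here only as an alternative to manual indexing:

```python
import numpy as np

revenues = [1000, 1500, 800, 2000, 1200]
costs = [600, 900, 500, 1100, 700]
tax_rates = [0.15, 0.18, 0.12, 0.20, 0.16]

# Loop-based result, pairing up the entries with zip() instead of indexing
loop_profits = [(r - c) * (1 - t) for r, c, t in zip(revenues, costs, tax_rates)]

# Vectorized result
vec_profits = (np.array(revenues) - np.array(costs)) * (1 - np.array(tax_rates))

# Compare with a tolerance, because floating-point results can differ
# slightly in the last digits and only look different when printed
print(np.allclose(loop_profits, vec_profits))  # True
```

NumPy shortens floats when printing arrays, which is why the vectorized output looks cleaner even though the underlying values match.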
. Performance: Numbers Don’t Lie
The difference I am talking about is not just hype or something theoretical. It is measurable and proven. Let’s look at a practical benchmark to understand how much improvement you can expect. We will create a large dataset of 1,000,000 elements, compute \(x^2 + 3x + 1\) for each element using both approaches, and compare the time.
import numpy as np
import time
# Create a large dataset
size = 1000000
data = list(range(size))
np_data = np.array(data)
# Test loop-based approach (perf_counter is a high-resolution timer)
start_time = time.perf_counter()
result_loop = []
for x in data:
    result_loop.append(x ** 2 + 3 * x + 1)
loop_time = time.perf_counter() - start_time
# Test vectorized approach
start_time = time.perf_counter()
result_vector = np_data ** 2 + 3 * np_data + 1
vector_time = time.perf_counter() - start_time
print(f"Loop time: {loop_time:.4f} seconds")
print(f"Vector time: {vector_time:.4f} seconds")
print(f"Speedup: {loop_time / vector_time:.1f}x faster")
Output:
Loop time: 0.4615 seconds
Vector time: 0.0086 seconds
Speedup: 53.9x faster
That is more than 50 times faster!
This is not a minor optimization; it will make your data processing tasks (I’m talking about big datasets) much more manageable. I am using NumPy for this tutorial, but Pandas is another library built on top of NumPy. You can use it too.
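To illustrate the Pandas remark, here is a minimal sketch that reuses the profit example with a DataFrame. The column names (`revenue`, `cost`, `tax_rate`, `net_profit`) are assumptions made for this illustration:

```python
import pandas as pd

# Hypothetical column names; the values reuse the profit example
df = pd.DataFrame({
    "revenue": [1000, 1500, 800, 2000, 1200],
    "cost": [600, 900, 500, 1100, 700],
    "tax_rate": [0.15, 0.18, 0.12, 0.20, 0.16],
})

# Column arithmetic is applied to every row at once, with no explicit loop
df["net_profit"] = (df["revenue"] - df["cost"]) * (1 - df["tax_rate"])

print(df["net_profit"].round(2).tolist())
```

The expression has the same shape as the NumPy version; the main difference is that the data is addressed by column name.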
. When Not to Vectorize
Just because something works for most cases does not mean it is the right approach everywhere. In programming, the “best” approach always depends on the problem at hand. Vectorization is great when you are performing the same operation on all elements of a dataset. But if your logic involves complex conditionals, early termination, or operations that depend on previous results, then stick to the loop-based approach.
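As a rough illustration of that distinction (the data and names below are made up for this sketch), a simple element-wise condition still vectorizes nicely with `np.where`, while a running balance that may never drop below zero depends on the previous result at every step and is clearest as a loop:

```python
import numpy as np

# Simple element-wise condition: still a good fit for vectorization
values = np.array([5, -3, 8, -1])
clipped = np.where(values < 0, 0, values)
print(clipped)  # [5 0 8 0]

# Each step depends on the previous balance, and the balance is floored
# at zero, so a plain loop expresses the logic most directly
changes = [50, -80, 30, -10, -100, 40]
balance = 0
balances = []
for change in changes:
    balance = max(0, balance + change)
    balances.append(balance)

print(balances)  # [50, 0, 30, 20, 0, 40]
```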
Similarly, when working with very small datasets, the overhead of setting up vectorized operations can outweigh the benefits. So use it where it makes sense, and do not force it where it doesn’t.
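One way to check this on your own machine is to time both styles on a tiny input. This is only a sketch: the function names are hypothetical, the five-element list is arbitrary, and the timings will vary by hardware and library version, so no expected numbers are shown:

```python
import timeit
import numpy as np

small = [1, 2, 3, 4, 5]  # deliberately tiny input

def loop_version():
    return [x ** 2 + 3 * x + 1 for x in small]

def vector_version():
    arr = np.array(small)  # the conversion cost is part of the overhead
    return arr ** 2 + 3 * arr + 1

# Repeat each call many times so the totals are large enough to compare
loop_t = timeit.timeit(loop_version, number=10_000)
vec_t = timeit.timeit(vector_version, number=10_000)

print(f"Loop:   {loop_t:.4f} s")
print(f"Vector: {vec_t:.4f} s")
```

Increasing the size of `small` step by step should show roughly where the vectorized version starts to pay off for a given workload.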
. Wrapping Up
As you continue to work with Python, challenge yourself to spot vectorization opportunities. When you find yourself reaching for a `for` loop, pause and ask whether there is a way to express the same operation using NumPy or Pandas. More often than not, there is, and the result will be code that is not only faster but also more elegant and easier to understand.
Remember, the goal is not to eliminate all loops from your code. It is to use the right tool for the job.
Kanwal Mehreen is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. She co-authored the ebook “Maximizing Productivity with ChatGPT”. As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She is also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.