Let's examine the solution for this DataFrame slicing challenge. We need to extract the last three rows and columns using iloc first. Start with cars.iloc—and here's a crucial point for Python developers: it's surprisingly easy to forget the .iloc or .loc accessor, especially when you're deep in regular Python programming patterns. Even experienced practitioners make this mistake regularly.
With .iloc, we need to specify positional indices for our last three rows: 154, 155, and 156. However, remember that iloc uses exclusive upper bounds—a fundamental aspect of Python's slicing behavior. To get rows 154-156, we must specify the range as 154:157, even though row 157 doesn't exist. This "up to but not including" logic ensures we capture exactly the rows we want.
For the columns in this 16-column dataset, we want the last three, which at first glance are columns 14, 15, and 16. Following the same exclusive upper bound rule, we specify this as 14:17. The intent is for pandas to include columns 14, 15, and 16, then stop before the non-existent column 17.
But wait—let's test this approach and see what happens. The result reveals a classic programming error that even seasoned data scientists encounter regularly.
We successfully retrieved three rows, but only two columns appeared. This is the infamous "off-by-one error," so common in programming that it has its own name and countless debugging hours attributed to it. The issue stems from zero-based indexing: while we have 16 columns total, they're indexed 0 through 15, not 1 through 16. The slice 14:17 therefore matched only positions 14 and 15, and iloc quietly ignored the out-of-range part of the slice instead of raising an error.
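A minimal sketch of that mistake, assuming a stand-in for cars: the real dataset isn't loaded here, so the block builds a hypothetical 157-row, 16-column DataFrame with placeholder column names (col_0, col_14, and so on) around the two names mentioned in this walkthrough.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `cars`: 157 rows, 16 columns, default integer row labels.
# Only 'Fuel Efficiency' and 'Power Perf Factor' come from the lesson; the rest are placeholders.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# The off-by-one attempt: positions 14 and 15 exist, position 16 does not.
wrong = cars.iloc[154:157, 14:17]

print(wrong.shape)            # (3, 2)
print(list(wrong.columns))    # ['col_14', 'Power Perf Factor']
```

Note that no exception is raised: an out-of-range slice is simply truncated, which is why this kind of bug can go unnoticed.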
The correct last three columns are indexed 13, 14, and 15. To capture these with iloc's exclusive upper bound, we need the range 13:16. This fundamental indexing principle trips up developers regularly, particularly when switching between different programming contexts or working under pressure.
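As a rough illustration of the corrected range, here is the same selection on a hypothetical stand-in for cars (157 rows, 16 columns; every column name except 'Fuel Efficiency' and 'Power Perf Factor' is a placeholder assumed for this sketch).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `cars`: 157 rows, 16 columns.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Positions 13, 14, 15 are the last three columns; 16 is the exclusive stop.
last_three = cars.iloc[154:157, 13:16]

print(last_three.shape)           # (3, 3)
print(list(last_three.columns))   # ['Fuel Efficiency', 'col_14', 'Power Perf Factor']
```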
Testing the corrected syntax cars.iloc[154:157, 13:16] produces the expected result. However, this hardcoded approach introduces a significant maintainability problem that affects real-world data workflows.
Consider what happens when your dataset changes—a common scenario in production environments. Add one more row to your data, and suddenly indices 154-156 no longer represent the "last three" rows. The same applies to columns: add a new feature, and your hardcoded column indices become obsolete. This brittleness makes your code fragile and error-prone in dynamic data environments.
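The brittleness can be sketched as follows, again on a hypothetical stand-in for cars with placeholder column names. The extra row and the use of pd.concat to add it are assumptions made for the demonstration, not part of the original exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `cars`: 157 rows, 16 columns.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Simulate the dataset growing by one row (values are arbitrary).
extra_row = pd.DataFrame([list(range(16))], columns=cars.columns)
grown = pd.concat([cars, extra_row], ignore_index=True)

# The hardcoded slice still runs without complaint, but it now misses the newest row.
hardcoded = grown.iloc[154:157, 13:16]

print(hardcoded.index.tolist())    # [154, 155, 156]
print(grown.index[-3:].tolist())   # [155, 156, 157]
```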
Python's negative indexing provides an elegant solution that makes your code more semantic and maintainable. Instead of hardcoding specific indices, use cars.iloc[-3:] for rows and cars.iloc[:, -3:] for columns, or combine the two as cars.iloc[-3:, -3:] to select both at once. The syntax [-3:] means "from the third-to-last element onward," with the omitted end value defaulting to "until the end."
This approach offers multiple advantages: it clearly communicates intent (we want the last three elements), removes the index arithmetic that invites off-by-one errors, and automatically adapts to dataset changes. Whether your DataFrame has 100 or 1000 rows, [-3:] always captures the final three. The code becomes self-documenting and robust.
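A short sketch of negative indexing with iloc, using a hypothetical stand-in for cars (157 rows, 16 columns, placeholder column names). The smaller frame at the end is an assumed example to show that the same slice adapts to a different shape.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `cars`: 157 rows, 16 columns.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Last three rows and last three columns, with no hardcoded sizes.
tail = cars.iloc[-3:, -3:]
print(tail.equals(cars.iloc[154:157, 13:16]))   # True

# The same expression works unchanged on a frame of a different shape.
small = cars.iloc[:100, :10]
small_tail = small.iloc[-3:, -3:]
print(small_tail.index.tolist())     # [97, 98, 99]
print(list(small_tail.columns))      # ['col_7', 'col_8', 'col_9']
```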
Now let's explore the equivalent operation using loc, which operates on labels rather than positions. I'll create a separate code block to preserve both examples for comparison—a best practice when demonstrating alternative approaches.
With cars.loc, we still use numeric indices for rows (154:156) since our DataFrame uses default integer row labels. However, loc uses inclusive bounds—a critical distinction from iloc. The range 154:156 includes both endpoints, capturing rows 154, 155, and 156 directly. No need for the 157 we required with iloc.
The real advantage of loc emerges with columns, where we can use meaningful labels instead of cryptic numbers. Instead of remembering that columns 13-15 represent our target data, we specify the actual column names: 'Fuel Efficiency':'Power Perf Factor'. This label-based approach makes code significantly more readable and maintainable.
Verifying our results confirms that both iloc and loc produce identical output, but the loc version communicates intent more clearly through descriptive column names.
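The comparison can be sketched like this on a hypothetical stand-in for cars. Only the two boundary column names come from this walkthrough; the remaining names, including the middle column col_14, are placeholders assumed for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `cars`: 157 rows, 16 columns, default integer row labels.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# loc: label-based, both endpoints of each slice are included.
by_label = cars.loc[154:156, "Fuel Efficiency":"Power Perf Factor"]

# iloc: position-based, the stop value of each slice is excluded.
by_position = cars.iloc[154:157, 13:16]

print(by_label.shape)                 # (3, 3)
print(by_label.equals(by_position))   # True
```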
We can make the loc approach more dynamic by calculating the starting row label programmatically. Because loc interprets -3 as a row label rather than as a position counted from the end, we can't simply use -3. Instead, calculate the starting label: with the default integer labels running from 0 to one less than the row count, len(cars.index) - 3 gives us the label of the third-to-last row. Combined with Python's slice notation, len(cars.index) - 3: creates a range from that calculated label to the end.
This programmatic approach mirrors the flexibility we achieved with negative indexing in iloc, ensuring our code remains robust as datasets evolve. The final solution elegantly combines calculated row positioning with descriptive column labels, resulting in code that's both maintainable and readable.
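A minimal sketch of the calculated start, assuming a hypothetical stand-in for cars with default integer row labels (0 through 156) and placeholder column names. The variable name start is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `cars`: 157 rows, 16 columns, default integer row labels.
columns = [f"col_{i}" for i in range(13)] + ["Fuel Efficiency", "col_14", "Power Perf Factor"]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# With labels 0..n-1, the third-to-last label equals the row count minus three.
start = len(cars.index) - 3
dynamic = cars.loc[start:, "Fuel Efficiency":"Power Perf Factor"]

print(start)                                   # 154
print(dynamic.equals(cars.iloc[-3:, -3:]))     # True
```

This arithmetic relies on the row labels matching the positions. If the index has been filtered, reordered, or replaced, looking up the label directly with cars.index[-3] is one alternative that does not depend on that assumption.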
That completes our exploration of DataFrame slicing techniques. These patterns—avoiding hardcoded indices, leveraging negative indexing, and choosing between positional and label-based selection—form the foundation of robust pandas data manipulation. Master these concepts, and you'll write more resilient, maintainable code for your data science workflows.