Visualizing and Predicting Classes with Scatterplot and KNN

Before asking our model to classify a new data point, it helps to see where that point sits relative to the training data. This visualization step shows how the classifier will handle previously unseen data. We'll start by adding the new coordinates to our existing data with X.append(new_x) and Y.append(new_y). Execute this code to update your data structure.
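As a minimal sketch of this step, the snippet below appends one point to two plain Python lists. The training coordinates are placeholder values chosen for illustration, not the tutorial's actual data; the new point (9, 19) is the one examined later in this walkthrough.

```python
# Placeholder training coordinates (assumed for illustration only).
X = [4, 5, 10, 4, 3, 11, 14, 8, 10, 12]
Y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]

# The new, unclassified point.
new_x, new_y = 9, 19

# list.append mutates the list in place, so no reassignment is needed.
X.append(new_x)
Y.append(new_y)

print(len(X), len(Y))  # 11 11
print(X[-1], Y[-1])    # 9 19
```

Because append modifies the lists in place, running this cell a second time would add the same point again, so it is worth checking the list lengths if the plot looks off.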

With our expanded dataset ready, we can regenerate our scatter plot. The plotting code we set up earlier is still in place, so we only need to run it again to display the updated results.

Our enhanced scatter plot maintains the same X and Y coordinate mapping as before, but now includes a text annotation to clearly identify our unclassified data point. Since we appended the new coordinates to our X and Y arrays, the plotting function automatically includes this point alongside our training data. Notice the distinctive label that marks our target prediction point.
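One way such an annotated plot might be produced is sketched below with matplotlib. The coordinates, the label text "new point", and the small horizontal offset are all assumptions for illustration; the tutorial's own plotting code is not reproduced here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt

# Placeholder data; the last entry in each list is the appended new point.
X = [4, 5, 10, 4, 3, 11, 14, 8, 10, 12, 9]
Y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21, 19]
new_x, new_y = X[-1], Y[-1]

fig, ax = plt.subplots()
ax.scatter(X, Y)

# Place a text label just to the right of the new point (offset is arbitrary).
label = ax.text(new_x + 0.3, new_y, "new point")

# Render the figure; in a script, plt.show() or fig.savefig(...) would display or save it.
fig.canvas.draw()
```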

This labeled point represents our unclassified data awaiting prediction. To improve visual clarity and distinguish this new point from our training data, let's assign it a unique class designation that will render in a different color.

We'll create a temporary variable for plotting by copying our existing classes list with Python's built-in list copy method. Copying leaves the original class labels untouched while letting us append a new identifier, in this case the value 2, which gives the unclassified point its own color.

Examining our classes_copy variable confirms that the new identifier was added at the end of the list. Because the new point is also the last entry in X and Y, this 2 lines up with it by position, which is the visual mapping we need.
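The copy-then-append step can be sketched as follows. The class labels are placeholder values assumed for illustration; only the appended 2 comes from the text.

```python
# Placeholder class labels for ten training points (assumed values).
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# list.copy() makes a shallow copy, so appending to it leaves `classes` unchanged.
classes_copy = classes.copy()
classes_copy.append(2)

print(classes)       # original labels, still ten entries
print(classes_copy)  # same labels plus a trailing 2
```

Had we written classes_copy = classes instead, both names would refer to the same list and the 2 would end up in the original labels as well.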

Now we'll regenerate our scatter plot with this visual distinction. The plot keeps the same X and Y coordinates but passes classes_copy as the color argument (the c parameter), so the new data point is drawn in its own color.
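A minimal sketch of the colored plot, assuming matplotlib and the same placeholder data as before. With numeric values passed to c, matplotlib maps them through its colormap, so the three distinct values 0, 1, and 2 each get a different color.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt

# Placeholder data; the last entry in each list belongs to the new point.
X = [4, 5, 10, 4, 3, 11, 14, 8, 10, 12, 9]
Y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21, 19]
classes_copy = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 2]

fig, ax = plt.subplots()

# c takes one value per point; its length must match X and Y.
sc = ax.scatter(X, Y, c=classes_copy)

fig.canvas.draw()
```

The c list must have exactly as many entries as there are points, which is why the extra label was appended to the copy before plotting.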

The resulting visualization clearly separates our data: purple points represent one class, green points indicate another, and our unclassified target point appears in yellow. This color coding provides immediate visual feedback about our data distribution and the positioning of our prediction target. With our visualization complete, we're ready to engage our trained model for actual prediction.

The prediction process begins with proper data formatting. We'll create a tuple containing our new coordinates: data_point = (new_x, new_y). This tuple holds the feature values for a single sample; the KNN model's prediction method takes a list of such samples.

Examining our data_point variable confirms the tuple structure: (9, 19). This coordinate pair is the exact location in our feature space where we want our model to make its classification prediction.

Now we'll call our KNN model's predict method. It expects a list of data points, like the X_test array we used during model validation, so even for a single point we wrap the tuple in a list: [data_point].

The prediction operation returns our result in array format, which is standard practice since most prediction workflows handle multiple data points simultaneously. This array structure maintains consistency across different batch sizes, from single predictions to large-scale inference operations.

To access our specific prediction result, we need to extract the first element using prediction[0]. This indexing step retrieves the actual classification value from the array wrapper, giving us the model's decision for our data point.
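The prediction steps can be sketched end to end as below. This assumes the model is scikit-learn's KNeighborsClassifier, which the text does not state explicitly; the training data and n_neighbors=1 are likewise placeholders, and the variable name knn is hypothetical.

```python
from sklearn.neighbors import KNeighborsClassifier

# Placeholder training data (assumed values), without the new point.
X = [4, 5, 10, 4, 3, 11, 14, 8, 10, 12]
Y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Each training sample is an (x, y) pair; n_neighbors=1 is an assumed setting.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(list(zip(X, Y)), classes)

# One sample as a tuple, wrapped in a list so predict receives a list of samples.
data_point = (9, 19)
prediction = knn.predict([data_point])

print(prediction)     # array holding one predicted label
print(prediction[0])  # the label itself
```

Passing data_point without the surrounding list would hand predict a one-dimensional input, which scikit-learn rejects because it cannot tell one sample with two features from two samples with one feature.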

The output demonstrates the distinction between the array container and the actual prediction value. The first line shows the complete array structure, while the second line displays the isolated prediction result for direct comparison and analysis. This prediction value represents our model's classification decision based on the nearest neighbor analysis.

With our prediction complete, we're ready to incorporate this result into our visualization framework, which we'll accomplish in the following section.
