NumPy Sorting, Searching, and Counting

NumPy’s Sorting

Sorting involves arranging elements in a specific sequence based on certain criteria, such as numerical order (ascending or descending) or alphabetical order. NumPy’s sorting functions are optimized for large datasets and can be applied along specified axes in multi-dimensional arrays.

FUNCTION               DESCRIPTION
numpy.ndarray.sort()   Sorts an array in place.
numpy.sort()           Returns a sorted copy of an array (along the last axis by default).
numpy.sort_complex()   Sorts a complex array using the real part first, followed by the imaginary part.
numpy.partition()      Returns a partitioned copy of an array.
numpy.argpartition()   Performs an indirect partition along the given axis using the algorithm specified by the kind keyword.

numpy.sort()

The numpy.sort() function returns a sorted copy of an array, with elements in ascending order along the specified axis (the last axis by default). The original array is left unchanged.

import numpy as np

DataFlair_array = np.array([5, 2, 8, 1, 3])
sorted_array = np.sort(DataFlair_array)

print("Original Array:", DataFlair_array)
print("Sorted Array:", sorted_array)

Output:
Original Array: [5 2 8 1 3]
Sorted Array: [1 2 3 5 8]
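
For multi-dimensional arrays, the axis argument decides which direction gets sorted. The following is a small sketch, using a made-up 2-D array named grid, of the default (last axis), axis=0, and axis=None behaviours. NumPy's sort has no descending option, so reversing an ascending result is one common workaround.

```python
import numpy as np

# Hypothetical 2-D array used only for illustration
grid = np.array([[9, 1, 8],
                 [3, 7, 2]])

rows_sorted = np.sort(grid)              # default axis=-1: sort within each row
cols_sorted = np.sort(grid, axis=0)      # sort within each column
flat_sorted = np.sort(grid, axis=None)   # flatten first, then sort
descending = flat_sorted[::-1]           # reverse the ascending result

print(rows_sorted)   # [[1 8 9], [2 3 7]]
print(cols_sorted)   # [[3 1 2], [9 7 8]]
print(flat_sorted)   # [1 2 3 7 8 9]
print(descending)    # [9 8 7 3 2 1]
```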

numpy.argsort()

The numpy.argsort() function returns the indices that would sort an array, rather than the sorted values themselves.

DataFlair_array = np.array([5, 2, 8, 1, 3])
indices = np.argsort(DataFlair_array)

print("Array:", DataFlair_array)
print("Indices for Sorting:", indices)

Output:
Array: [5 2 8 1 3]
Indices for Sorting: [3 1 4 0 2]
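
A typical use of these indices is to reorder a second array so it stays aligned with the sorted one. The sketch below assumes two hypothetical parallel arrays, scores and names.

```python
import numpy as np

# Hypothetical parallel arrays: names[i] belongs to scores[i]
scores = np.array([72, 95, 60, 88])
names = np.array(["Ana", "Ben", "Cy", "Dee"])

order = np.argsort(scores)         # indices that sort scores ascending
sorted_scores = scores[order]      # same values as np.sort(scores)
sorted_names = names[order]        # names reordered to match
best_first = names[order[::-1]]    # highest score first

print(order)          # [2 0 3 1]
print(sorted_scores)  # [60 72 88 95]
print(sorted_names)   # ['Cy' 'Ana' 'Dee' 'Ben']
print(best_first)     # ['Ben' 'Dee' 'Ana' 'Cy']
```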

numpy.lexsort()

The numpy.lexsort() function performs an indirect sort using a sequence of keys. The last key in the sequence is the primary sort key, so the call below sorts by first name and uses last name only to break ties.

first_name = np.array(['John', 'Jane', 'Adam', 'Eve'])
last_name = np.array(['Doe', 'Smith', 'Smith', 'Doe'])
indices = np.lexsort((last_name, first_name))

print("First Names:", first_name)
print("Last Names:", last_name)
print("Sorted Indices:", indices)

Output:
First Names: ['John' 'Jane' 'Adam' 'Eve']
Last Names: ['Doe' 'Smith' 'Smith' 'Doe']
Sorted Indices: [2 3 1 0]
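
Because the last key passed to lexsort() has the highest priority, swapping the key order changes the result. As a small sketch with the same names, sorting by last name first and then by first name looks like this:

```python
import numpy as np

first_name = np.array(["John", "Jane", "Adam", "Eve"])
last_name = np.array(["Doe", "Smith", "Smith", "Doe"])

# Keys are listed from lowest to highest priority:
# last_name is the primary key, first_name breaks ties.
by_last_then_first = np.lexsort((first_name, last_name))

print(by_last_then_first)              # [3 0 2 1]
print(last_name[by_last_then_first])   # ['Doe' 'Doe' 'Smith' 'Smith']
print(first_name[by_last_then_first])  # ['Eve' 'John' 'Adam' 'Jane']
```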

numpy.ndarray.sort()

The sort() method of NumPy arrays performs an in-place sort.

DataFlair_array = np.array([5, 2, 8, 1, 3])
DataFlair_array.sort()

print("Sorted Array:", DataFlair_array)

Output:
Sorted Array: [1 2 3 5 8]

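numpy.msort()

Despite its name, numpy.msort() did not perform a merge sort. It returned a copy of an array sorted along the first axis, which is the same as np.sort(a, axis=0). The function was deprecated in NumPy 1.24 and removed in NumPy 2.0, so the example below uses the np.sort() equivalent.

DataFlair_array = np.array([5, 2, 8, 1, 3])
sorted_array = np.sort(DataFlair_array, axis=0)  # replacement for np.msort()

print("Original Array:", DataFlair_array)
print("Sorted Array:", sorted_array)

Output:
Original Array: [5 2 8 1 3]
Sorted Array: [1 2 3 5 8]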

numpy.sort_complex()

The numpy.sort_complex() function sorts complex numbers by their real parts first, and then by their imaginary parts where the real parts are equal. It does not sort by magnitude.

DataFlair_complex = np.array([2+3j, 1-5j, 4+2j])
sorted_complex = np.sort_complex(DataFlair_complex)

print("Original Complex Array:", DataFlair_complex)
print("Sorted Complex Array:", sorted_complex)

Output:
Original Complex Array: [2.+3.j 1.-5.j 4.+2.j]
Sorted Complex Array: [1.-5.j 2.+3.j 4.+2.j]
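
To see that the ordering depends on the real and imaginary parts rather than on magnitude, the sketch below compares sort_complex() with an explicit sort by absolute value. The array values is made up for illustration.

```python
import numpy as np

# Hypothetical values chosen so the two orderings differ
values = np.array([1 + 5j, 3 + 0j, 1 - 2j])

by_parts = np.sort_complex(values)                  # real part, then imaginary part
by_magnitude = values[np.argsort(np.abs(values))]   # explicit sort by |z|

print(by_parts)       # 1-2j, 1+5j, 3+0j
print(by_magnitude)   # 1-2j, 3+0j, 1+5j
```

If ordering by magnitude is what you need, the argsort-on-np.abs pattern is one way to get it.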

numpy.partition()

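The numpy.partition() function performs a partial sort along the specified axis. The element at index kth (2 in the example below) ends up where it would be in a fully sorted array, all smaller elements are moved before it, and all equal or larger elements are moved after it. The order within the two partitions is undefined, so the exact output can differ from a full sort.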

DataFlair_array = np.array([5, 2, 8, 1, 3])
partitioned_array = np.partition(DataFlair_array, 2)

print("Original Array:", DataFlair_array)
print("Partitioned Array:", partitioned_array)

Output:
Original Array: [5 2 8 1 3]
Partitioned Array: [1 2 3 5 8]
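
Partitioning is handy when only the k smallest values matter and a full sort would be wasted work. The sketch below, with a made-up array and k, relies only on what partition() guarantees and sorts just the small slice it needs.

```python
import numpy as np

values = np.array([5, 2, 8, 1, 3, 9, 4])  # hypothetical data
k = 3

part = np.partition(values, k)

# Guaranteed: part[k] is the value a full sort would place at index k,
# and part[:k] holds the k smallest values in some unspecified order.
kth_value = part[k]
smallest_k = np.sort(part[:k])   # sort only the k values we care about

print(kth_value)    # 4
print(smallest_k)   # [1 2 3]
```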

numpy.argpartition()

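The numpy.argpartition() function returns the indices that would partition an array, so indexing the array with them gives the same kind of result as numpy.partition(). As with partition(), only position kth is guaranteed; the order of the remaining indices is undefined.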

DataFlair_array = np.array([5, 2, 8, 1, 3])
indices = np.argpartition(DataFlair_array, 2)

print("Array:", DataFlair_array)
print("Indices for Partitioning:", indices)

Output:
Array: [5 2 8 1 3]
Indices for Partitioning: [3 1 4 0 2]
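
A common pattern is finding the indices of the k largest values. The following sketch assumes a hypothetical scores array; a negative kth counts from the end, and the final argsort is only needed if the top k must be ordered.

```python
import numpy as np

scores = np.array([0.10, 0.75, 0.30, 0.95, 0.50])  # hypothetical data
k = 2

# The last k positions of the partition hold the k largest values (unordered)
top_k_unordered = np.argpartition(scores, -k)[-k:]

# Order those k indices from highest score to lowest
top_k = top_k_unordered[np.argsort(scores[top_k_unordered])[::-1]]

print(top_k)           # [3 1]
print(scores[top_k])   # [0.95 0.75]
```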

NumPy’s Searching

Searching refers to the process of locating specific elements or patterns within data. NumPy’s searching functions are designed to facilitate these tasks efficiently in arrays, matrices, and multidimensional data structures.

FUNCTION               DESCRIPTION
numpy.nanargmin()      Returns the indices of the minimum values along the specified axis while ignoring NaN values.
numpy.argwhere()       Finds the indices of array elements that are non-zero, grouped by element (one row of indices per element).
numpy.nonzero()        Returns the indices of elements that are non-zero in the array.
numpy.flatnonzero()    Returns indices that are non-zero in the flattened version of the array.
numpy.where()          Returns elements chosen from ‘x’ or ‘y’ depending on a specified condition; with only a condition, returns the indices where it is true.
numpy.searchsorted()   Finds indices where elements should be inserted into an array to maintain order.
numpy.extract()        Returns the elements of an array that satisfy a given condition.

numpy.argmax()

The numpy.argmax() function returns the indices of the maximum value along a specified axis.

import numpy as np

DataFlair_array = np.array([5, 2, 8, 1, 3])
max_index = np.argmax(DataFlair_array)

print("Array:", DataFlair_array)
print("Index of Maximum Value:", max_index)

Output:
Array: [5 2 8 1 3]
Index of Maximum Value: 2

numpy.nanargmax()

The numpy.nanargmax() function returns the index of the maximum value, ignoring NaN values.

DataFlair_array = np.array([5, np.nan, 8, 1, 3])
max_index = np.nanargmax(DataFlair_array)

print("Array:", DataFlair_array)
print("Index of Maximum Value (ignoring NaN):", max_index)

Output:
Array: [ 5. nan  8.  1.  3.]
Index of Maximum Value (ignoring NaN): 2
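
The difference from plain argmax() shows up as soon as the data contains NaN: argmax() treats NaN as the maximum and returns its index. The sketch below, using a hypothetical readings array, also shows that nanargmax() raises ValueError when every value is NaN.

```python
import numpy as np

readings = np.array([5.0, np.nan, 8.0, 1.0, 3.0])  # hypothetical data

plain = np.argmax(readings)     # NaN propagates: index of the NaN
safe = np.nanargmax(readings)   # NaN is skipped

# An all-NaN input has no valid maximum, so nanargmax raises ValueError
try:
    np.nanargmax(np.array([np.nan, np.nan]))
    all_nan_raises = False
except ValueError:
    all_nan_raises = True

print(plain)            # 1
print(safe)             # 2
print(all_nan_raises)   # True
```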

numpy.argmin()

The numpy.argmin() function returns the indices of the minimum value along a specified axis.

DataFlair_array = np.array([5, 2, 8, 1, 3])
min_index = np.argmin(DataFlair_array)

print("Array:", DataFlair_array)
print("Index of Minimum Value:", min_index)

Output:
Array: [5 2 8 1 3]
Index of Minimum Value: 3

numpy.nanargmin()

The numpy.nanargmin() function returns the index of the minimum value, ignoring NaN values.

DataFlair_array = np.array([5, np.nan, 8, 1, 3])
min_index = np.nanargmin(DataFlair_array)

print("Array:", DataFlair_array)
print("Index of Minimum Value (ignoring NaN):", min_index)

Output:
Array: [ 5. nan  8.  1.  3.]
Index of Minimum Value (ignoring NaN): 3

numpy.argwhere()

The numpy.argwhere() function returns the indices of elements that satisfy a given condition.

DataFlair_array = np.array([5, 2, 8, 1, 3])
indices = np.argwhere(DataFlair_array > 2)

print("Array:", DataFlair_array)
print("Indices of Elements > 2:", indices)

Output:
Array: [5 2 8 1 3]
Indices of Elements > 2: [[0]
 [2]
 [4]]
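
The column shape of that result makes more sense in two dimensions, where argwhere() returns one [row, column] pair per match. The sketch below uses a made-up matrix and contrasts this with nonzero(), which returns one index array per dimension and is the form suited to indexing.

```python
import numpy as np

matrix = np.array([[1, 7, 3],
                   [8, 2, 9]])  # hypothetical data

coords = np.argwhere(matrix > 5)     # shape (n_matches, ndim)
rows, cols = np.nonzero(matrix > 5)  # one array per dimension

print(coords)              # [[0 1], [1 0], [1 2]]
print(rows, cols)          # [0 1 1] [1 0 2]
print(matrix[rows, cols])  # [7 8 9]
```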

numpy.nonzero()

The numpy.nonzero() function returns the indices of non-zero elements in an array.

DataFlair_array = np.array([0, 10, 0, 25, 30, 0])
nonzero_indices = np.nonzero(DataFlair_array)

print("Array:", DataFlair_array)
print("Indices of Non-zero Elements:", nonzero_indices)

Output:
Array: [ 0 10  0 25 30  0]
Indices of Non-zero Elements: (array([1, 3, 4]),)

numpy.flatnonzero()

The numpy.flatnonzero() function returns indices of non-zero elements in a flattened array.

DataFlair_array = np.array([0, 10, 0, 25, 30, 0])
flat_nonzero_indices = np.flatnonzero(DataFlair_array)

print("Array:", DataFlair_array)
print("Indices of Non-zero Elements (Flattened):", flat_nonzero_indices)

Output:
Array: [ 0 10  0 25 30  0]
Indices of Non-zero Elements (Flattened): [1 3 4]

numpy.where()

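When called with only a condition, the numpy.where() function returns the indices of elements that satisfy it, exactly like numpy.nonzero(). When called as numpy.where(condition, x, y), it instead returns an array that takes values from x where the condition is true and from y elsewhere.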

DataFlair_array = np.array([5, 2, 8, 1, 3])
indices = np.where(DataFlair_array > 2)

print("Array:", DataFlair_array)
print("Indices of Elements > 2:", indices)

Output:
Array: [5 2 8 1 3]
Indices of Elements > 2: (array([0, 2, 4]),)
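
The three-argument form of where() selects values instead of returning indices. As a small sketch with the same array (the names kept and labels are just for illustration):

```python
import numpy as np

values = np.array([5, 2, 8, 1, 3])

# Keep values greater than 2, replace everything else with 0
kept = np.where(values > 2, values, 0)

# x and y can also be scalars, for example labels
labels = np.where(values > 2, "big", "small")

print(kept)     # [5 0 8 0 3]
print(labels)   # ['big' 'small' 'big' 'small' 'big']
```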

numpy.searchsorted()

The numpy.searchsorted() function conducts a binary search within a sorted array to determine the positions where elements should be added to preserve their order.

sorted_array = np.array([1, 3, 5, 7, 9])
indices = np.searchsorted(sorted_array, 6)

print("Sorted Array:", sorted_array)
print("Index to Insert 6:", indices)

Output:
Sorted Array: [1 3 5 7 9]
Index to Insert 6: 3
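
When the sorted array contains duplicates, the side argument decides whether the insertion point falls before or after the equal values. The sketch below uses a made-up array with a repeated 5 and also shows that several values can be looked up in one call.

```python
import numpy as np

sorted_values = np.array([1, 3, 5, 5, 7, 9])  # must already be sorted

left = np.searchsorted(sorted_values, 5)                 # default side='left'
right = np.searchsorted(sorted_values, 5, side="right")  # after existing 5s
many = np.searchsorted(sorted_values, [0, 6, 10])        # several lookups at once

print(left)    # 2
print(right)   # 4
print(many)    # [0 4 6]
```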

numpy.extract()

The numpy.extract() function returns elements from an array that satisfy a condition.

DataFlair_array = np.array([5, 2, 8, 1, 3])
condition = DataFlair_array > 2
extracted_elements = np.extract(condition, DataFlair_array)

print("Array:", DataFlair_array)
print("Extracted Elements:", extracted_elements)

Output:
Array: [5 2 8 1 3]
Extracted Elements: [5 8 3]

NumPy’s Counting

Counting in the context of data manipulation involves quantifying occurrences, frequencies, or unique values within a dataset. NumPy’s counting functions provide efficient ways to extract essential statistical insights from arrays and matrices.

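Counting a specific value

NumPy has no numpy.count() function. To count the occurrences of a specific value, compare the array with that value and pass the resulting boolean mask to numpy.count_nonzero().

DataFlair_array = np.array([1, 2, 2, 3, 3, 3])
count_of_3 = np.count_nonzero(DataFlair_array == 3)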

print("Array:", DataFlair_array)
print("Count of 3:", count_of_3)

Output:
Array: [1 2 2 3 3 3]
Count of 3: 3

numpy.bincount()

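The numpy.bincount() function counts occurrences of non-negative integers in an array. Element i of the result is the number of times the value i appears, so the result has one entry for every integer from 0 up to the largest value.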

import numpy as np

DataFlair_array = np.array([1, 2, 2, 3, 3, 3])
bin_counts = np.bincount(DataFlair_array)

print("Array:", DataFlair_array)
print("Bin Counts:", bin_counts)

Output:
Array: [1 2 2 3 3 3]
Bin Counts: [0 1 2 3]
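
Two optional arguments are often useful here: minlength pads the result to a fixed number of bins, and weights sums a second array per bin instead of counting. A minimal sketch, where group_ids and amounts are hypothetical:

```python
import numpy as np

values = np.array([1, 2, 2, 3, 3, 3])

# Force at least 6 bins (values 0..5) even though the maximum is 3
padded = np.bincount(values, minlength=6)

# Sum amounts per group id instead of counting occurrences
group_ids = np.array([0, 1, 1, 2])
amounts = np.array([10.0, 2.5, 2.5, 4.0])
totals = np.bincount(group_ids, weights=amounts)

print(padded)   # [0 1 2 3 0 0]
print(totals)   # [10.  5.  4.]
```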

numpy.unique()

The numpy.unique() function returns the sorted unique elements of an array. With return_counts=True, it also returns how many times each unique element occurs.

DataFlair_array = np.array([1, 2, 2, 3, 3, 3])
unique_elements, counts = np.unique(DataFlair_array, return_counts=True)

print("Array:", DataFlair_array)
print("Unique Elements:", unique_elements)
print("Counts:", counts)

Output:
Array: [1 2 2 3 3 3]
Unique Elements: [1 2 3]
Counts: [1 2 3]
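
Combining the counts with argmax() gives the most frequent value, and return_inverse=True gives the positions needed to rebuild the original array from the unique values. The sketch below uses a made-up array of colour names.

```python
import numpy as np

colors = np.array(["red", "blue", "red", "green", "red", "blue"])  # hypothetical data

names, inverse, counts = np.unique(
    colors, return_inverse=True, return_counts=True
)

most_common = names[np.argmax(counts)]  # value with the highest count
rebuilt = names[inverse]                # same as the original array

print(names)        # ['blue' 'green' 'red']
print(counts)       # [2 1 3]
print(inverse)      # [2 0 2 1 2 0]
print(most_common)  # red
```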

numpy.histogram()

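The numpy.histogram() function computes the histogram of a dataset: it counts how many values fall into each bin. Every bin includes its left edge and excludes its right edge, except the last bin, which includes both edges.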

DataFlair_array = np.array([2, 5, 7, 10, 15, 20, 25, 30])
hist, bin_edges = np.histogram(DataFlair_array, bins=[0, 10, 20, 30])

print("Array:", DataFlair_array)
print("Histogram:", hist)
print("Bin Edges:", bin_edges)

Output:
Array: [ 2  5  7 10 15 20 25 30]
Histogram: [3 2 3]
Bin Edges: [ 0 10 20 30]
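
The bin-edge rules explain where the boundary values land: 10 and 20 each fall into the bin that starts at them, while 30 is counted in the last bin because that bin is closed on both sides. The sketch below reproduces this and also shows that values outside the edges are silently ignored (the -1 and 31 are made up for illustration).

```python
import numpy as np

values = np.array([2, 5, 7, 10, 15, 20, 25, 30])
edges = [0, 10, 20, 30]

# Bins are [0, 10), [10, 20), and [20, 30] (last bin includes 30)
hist, bin_edges = np.histogram(values, bins=edges)

# Values below the first edge or above the last edge are not counted
outside, _ = np.histogram(np.array([-1, 31]), bins=edges)

print(hist)      # [3 2 3]
print(outside)   # [0 0 0]
```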

numpy.count_nonzero()

The numpy.count_nonzero() function counts the number of non-zero elements in an array.

DataFlair_array = np.array([0, 10, 0, 25, 30, 0])
nonzero_count = np.count_nonzero(DataFlair_array)

print("Array:", DataFlair_array)
print("Count of Non-zero Elements:", nonzero_count)

Output:
Array: [ 0 10  0 25 30  0]
Count of Non-zero Elements: 3
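
count_nonzero() also accepts an axis argument and works on boolean masks, where True counts as non-zero. A small sketch with a hypothetical 2-D array:

```python
import numpy as np

matrix = np.array([[0, 10, 0],
                   [25, 30, 0]])  # hypothetical data

per_column = np.count_nonzero(matrix, axis=0)   # count down each column
per_row = np.count_nonzero(matrix, axis=1)      # count across each row
above_20 = np.count_nonzero(matrix > 20)        # count elements matching a condition

print(per_column)   # [1 2 0]
print(per_row)      # [1 2]
print(above_20)     # 2
```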

Counting zero elements

NumPy has no numpy.count_zero() function. To count the zero elements in an array, compare it with 0 and count the matches with numpy.count_nonzero().

DataFlair_array = np.array([0, 10, 0, 25, 30, 0])
zero_count = np.count_nonzero(DataFlair_array == 0)

print("Array:", DataFlair_array)
print("Count of Zero Elements:", zero_count)

Output:
Array: [ 0 10  0 25 30  0]
Count of Zero Elements: 3

Conclusion

In this brief tutorial, we’ve witnessed how NumPy’s capabilities can swiftly organize, pinpoint, and quantify information. Sorting enables us to identify patterns, searching helps us find critical elements, and counting unveils valuable insights about data distribution.

As you delve deeper into the realm of data science, remember that NumPy’s features are your secret weapons for efficient and effective analysis. With these techniques at your disposal, you’re empowered to extract meaningful insights from your data, driving your blog’s analytics and discoveries to new heights. Happy Coding!