{"id":80080,"date":"2020-10-26T09:00:55","date_gmt":"2020-10-26T03:30:55","guid":{"rendered":"https:\/\/techvidvan.com\/tutorials\/?p=80080"},"modified":"2020-10-26T09:00:55","modified_gmt":"2020-10-26T03:30:55","slug":"machine-learning-k-means-clustering","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/","title":{"rendered":"K-Means Clustering in Machine Learning"},"content":{"rendered":"<p>In this article, we will specifically focus on one of the popular algorithms of clustering i.e., <strong>K-means clustering<\/strong>. This is one of the most used clustering algorithms.<\/p>\n<p>Here, we will mainly look at how the algorithm works and what its benefits and shortcomings are. We will also discuss its areas of usage. So, let\u2019s begin.<\/p>\n<h3>What is K-means Clustering?<\/h3>\n<p>The algorithm that we will now dive into comes under unsupervised learning. Here, we deal with data that isn\u2019t labelled, and unsupervised learning generally uses only the input <strong>vectors<\/strong> to draw information from the datasets.<\/p>\n<p>The premise of k-means clustering is that it groups similar data points together into clusters and keeps dissimilar points in different clusters.<\/p>\n<p>Let\u2019s say there are two different varieties of data in a dataset; then we will have two clusters. Here, \u2018k\u2019 is a fixed value. The number we set \u2018k\u2019 to is the number of clusters the algorithm will form. Also, in mathematics, there is a concept known as <strong>vector quantization<\/strong>.<\/p>\n<p>K-means clustering is one method of vector quantization. Also, the K-means algorithm is closely related to the more general Expectation-Maximization algorithm: it alternates an assignment step and an update step in a similar way, but it assigns each point to exactly one cluster.<\/p>\n<p>We have a technique known as the elbow point technique for determining the value of k. We will study it in this article.<\/p>\n<p><strong>Note:- The second meaning of K is that it is a hyper-parameter. 
A hyper-parameter is a variable whose value we set before the training phase of the algorithm.<\/strong><\/p>\n<p>Well, it\u2019s always better to understand with a real-life example.<\/p>\n<p>If we have a dataset of let\u2019s say, junk-food eaters and healthy food eaters, the value of k here will be 2. Since we have two clusters that can form over here. We have shown it graphically in x-y axes terms as well. This way we can also find categories of data like only junk-food eaters, only healthy-food eaters.<\/p>\n<p>Also, varying categories like junk-food eaters who also consume healthy food and vice versa. The numbers can vary, but that\u2019s exactly the aim here.<\/p>\n<p>To understand various hidden patterns in the data. If we can picture these:<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image01.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80117\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image01.jpg\" alt=\"k-means clustering\" width=\"800\" height=\"628\" \/><\/a><\/p>\n<p>This plot shows the distribution of the data. This has numerous combinations of data available in it. Now, let\u2019s see how the cluster forms.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image02.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80118\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image02.jpg\" alt=\"Cluster in ML\" width=\"900\" height=\"628\" \/><\/a><\/p>\n<p>The cross-marks are the <strong>centroids<\/strong> of the clusters. The algorithm here identifies the centroids of the clusters. 
A centroid can be an imaginary location or coincide with a real data point, but the main aim is to find the centroids and to cluster data around them.<\/p>\n<h3>How does K-means clustering work?<\/h3>\n<p>Let\u2019s try to understand how exactly this algorithm works. Let\u2019s see a basic explanation first and then dive into the deeper concepts.<\/p>\n<ul>\n<li>Everything begins with the value of \u2018k\u2019. The preset value tells the algorithm how many clusters it should form.<\/li>\n<li>At first, the algorithm places k initial centroids (let\u2019s say k=2 for now). These are just random points for now and are not yet at the center of any cluster.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image03.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80119\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image03.jpg\" alt=\"working of k-means clustering\" width=\"800\" height=\"628\" \/><\/a><\/p>\n<p>These two points above are just random.<\/p>\n<ul>\n<li>The algorithm will now calculate the distance between each centroid and each point.<\/li>\n<li>Remember that the distance for one point will be calculated to both centroids.<\/li>\n<li>The point joins the centroid that is closest to it.<\/li>\n<li>After the points have been separated by their distance to the centroids, the visualization would look like this:<\/li>\n<\/ul>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image04.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80120\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image04.jpg\" alt=\"k means cluster\" width=\"800\" height=\"628\" \/><\/a><\/p>\n<p>We have the points arranged around their respective centroids.<\/p>\n<ul>\n<li>Now, since the 
centroids were just random points at first, they now need to move to positions where they behave like actual centroids.<\/li>\n<li>For this, the algorithm moves each centroid to the mean (average position) of the points currently assigned to its cluster.<\/li>\n<li>The centroids will now look something like the image below.<\/li>\n<li>Notice that with each iteration, the points are reassigned to their nearest centroid, the centroids are recomputed, and each centroid slowly moves into a more stable position.<\/li>\n<li>The images are static; they only show how the centroids shift their positions.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image05.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80121\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image05.jpg\" alt=\"k- means cluster\" width=\"800\" height=\"628\" \/><\/a><\/p>\n<p>Compare the previous image with this one. If you look carefully, the centroids have now shifted.<\/p>\n<ul>\n<li>The distance calculation and the repositioning of centroids and clusters will repeat until we reach a stable structure.<\/li>\n<li>The entire process, from the distance comparison up to the readjusting of clusters, is iterative.<\/li>\n<li>If the result of an iteration is still unstable, the process starts again from the distance calculation phase.<\/li>\n<li>When we achieve stability, this means that the clusters have formed and now have stable centroids.<\/li>\n<li>Also, we have some stopping criteria for the k-means clustering algorithm.<\/li>\n<li>The stopping criteria are essential to determine whether the result has been achieved and when to stop the algorithm. 
These are:<\/li>\n<\/ul>\n<blockquote>\n<ol>\n<li>Centroids have stabilized and do not require a change in their value. This means that the clusters have become stable too.<\/li>\n<li>The points in one cluster remain in the same even after many iterations. This means that they are now part of a stable cluster. If they move to another cluster, this would mean instability.<\/li>\n<li>The last one is if the number of iterations is completed. We can pre-set the number of iterations as let\u2019s say 50 or 100 (any number). If it\u2019s completed then the program stops.<\/li>\n<\/ol>\n<\/blockquote>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image06.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80122\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image06.jpg\" alt=\"k- means\" width=\"800\" height=\"628\" \/><\/a><\/p>\n<p>The final stable version would look something like this. Now that we have seen the explanation for this basic concept, let us now dive deeper. Let\u2019s try to understand some side concepts that are necessary to know.<\/p>\n<h3>Special Properties of Clusters in Machine Learning<\/h3>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/Special-properties-of-Clusters-in-ML.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80131\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/Special-properties-of-Clusters-in-ML.jpg\" alt=\"Special properties of Clusters in ML\" width=\"1000\" height=\"500\" \/><\/a><\/p>\n<h4>1. Inertia<\/h4>\n<p>Inertia is the intra-cluster distance that we calculate. The measurement of the inertia is very significant in the formation of a cluster because it will help us to improve the <strong>stability<\/strong> of the cluster. 
The closer the points are to the centroid, the more compact and stable the cluster is.<\/p>\n<p>So, the conclusion is that the distances between the points and the centroid within a cluster must be small for the cluster to be stable. So, the value of inertia should also be low.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image07.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80123\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image07.jpg\" alt=\"Inertia in ML\" width=\"850\" height=\"628\" \/><\/a><\/p>\n<p>Here, we calculate the intra-cluster distance, that is, the sum of the distances (in practice, usually the squared distances) of all the individual points from the centroid of the cluster. The centroid is the red point and all the rest are the green points.<\/p>\n<h4>2. Dunn Index in Machine Learning<\/h4>\n<p>We saw that inertia is about the points within a cluster. But what about the nearby clusters? Even though inertia helps in creating compact clusters, it says nothing about how well two different clusters are separated.<\/p>\n<p>This is where the Dunn index comes into play. The Dunn index takes both of these aspects into account to judge whether the clusters are <strong>stable<\/strong>.<\/p>\n<p style=\"text-align: left\"><strong>Dunn Index = (minimum distance between two clusters)\/(maximum distance of points within a cluster)<\/strong><\/p>\n<p>So basically, this is the smallest distance between any two clusters divided by the largest <strong>intra-cluster<\/strong> distance. It is related to inertia but is not the same quantity.<\/p>\n<p>The key point to note here is that the Dunn index should be as high as possible for the clusters to be stable. 
The intra-cluster distance has to be small and the inter-cluster distance has to be large.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image08.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80124\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image08.jpg\" alt=\"Dunn Index in ML\" width=\"900\" height=\"628\" \/><\/a><\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image09.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80125\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image09.jpg\" alt=\"Dunn Index in Machine Learning\" width=\"800\" height=\"768\" \/><\/a><\/p>\n<p>Here, we calculate the inter-cluster distance, that is, the distance between the two centroids of the clusters. We also calculate the intra-cluster distance, as the Dunn index uses both to judge how stable the clusters are.<\/p>\n<h4>Distance Metrics in Machine Learning<\/h4>\n<p>Distance metrics help us to measure the distance between the <strong>centroids<\/strong> and the <strong>data points<\/strong>. There are several distance calculation methods.<\/p>\n<p>For k-means clustering, we will look at four specific types.<\/p>\n<p><strong>1. Euclidean distance:<\/strong> We use this method to measure the distance between two points with <strong>integer<\/strong> or <strong>floating-point<\/strong> (real-valued) coordinates.<\/p>\n<p>We calculate this distance along a straight line; it is the square root of the sum of the squared coordinate differences. To put it in simpler terms, let\u2019s see the formula.<\/p>\n<p><strong>d = \u221a(\u2211(B<sub>i<\/sub> \u2013 A<sub>i<\/sub>)<sup>2<\/sup>)<\/strong><\/p>\n<p>Here, the summation is from i = 1 to n. 
Also, A and B are the two points, A<sub>i<\/sub> and B<sub>i<\/sub> are their i-th coordinates, and n is the number of dimensions.<\/p>\n<ul>\n<li>Squared Euclidean distance: This distance is just like the normal Euclidean distance but without the square root.<\/li>\n<\/ul>\n<p><strong>d = \u2211(B<sub>i<\/sub> \u2013 A<sub>i<\/sub>)<sup>2<\/sup><\/strong><\/p>\n<p><strong>2. Manhattan distance:<\/strong> The Manhattan distance is the sum of the absolute differences between the <strong>coordinates<\/strong> of the points. We can say that it is the sum of the vertical and the horizontal components.<\/p>\n<p>Also, we can say that it is the distance between the points that we measure along the axes (at 90 degrees).<\/p>\n<p><strong>d = |A<sub>1<\/sub> \u2013 B<sub>1<\/sub>| + |A<sub>2<\/sub> \u2013 B<sub>2<\/sub>|<\/strong><\/p>\n<p>This is the case for two dimensions. But for m dimensions, we have a generalized version.<\/p>\n<p><strong>d = \u2211|A<sub>i<\/sub> &#8211; B<sub>i<\/sub>|<\/strong><\/p>\n<p>Here, the summation is from i=1 to m.<\/p>\n<p><strong>3. Minkowski distance:<\/strong> This is a generalization of both the Euclidean and Manhattan distances: p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance. The formula works for points with any number of dimensions m.<\/p>\n<p><strong>d = (\u2211|A<sub>i<\/sub> &#8211; B<sub>i<\/sub>|<sup>p<\/sup>)<sup>1\/p<\/sup><\/strong><\/p>\n<p>Here, p is the order and the summation is from i=1 to m.<\/p>\n<p>So now, let\u2019s discuss how to find the value of k using a technique known as the <strong>Elbow point technique.<\/strong><\/p>\n<h4>Elbow-Point Technique in Machine Learning<\/h4>\n<p>This technique is a systematic way to find a good value of k. The usual method of assigning the value of k beforehand is just trial and error. Here, we choose the value of k from a graph plot.<\/p>\n<p>Once we get the value of \u2018k\u2019, the procedure is the same as before. So, here, we run the algorithm for several values of \u2018k\u2019, and each run forms that many clusters.<\/p>\n<p>Let\u2019s say if k=1 then 1 cluster will form. 
For k=2, 2 clusters will form, and so on.<\/p>\n<p>Now, the main concept is that we have to calculate the <strong>summation<\/strong> of the squared intra-cluster distances.<\/p>\n<p>Let\u2019s say a cluster has 10 points: we square the distance between each point and the centroid, and the sum of these squares, taken over all clusters, is the result we need. This result is known as the cost (sum of squared errors).<\/p>\n<p>For each value of \u2018k\u2019, we plot the cost. As \u2018k\u2019 grows, we have more centroids and a lower cost.<\/p>\n<p>So, the point where the graph bends sharply, after which the cost falls only slowly, is the elbow point. The value of \u2018k\u2019 at the elbow point is our chosen value of \u2018k\u2019.<\/p>\n<p>The sum of squared distances here means the squared distance between each data point and its nearest centroid, added up over all points.<\/p>\n<p><strong>S = \u2211(X<sub>i<\/sub> \u2013 C<sub>i<\/sub>)<sup>2<\/sup><\/strong><\/p>\n<p>Here, X<sub>i<\/sub> is a data point, C<sub>i<\/sub> is the centroid nearest to it, and the summation is from i=1 to m, the number of points. 
To understand it better, let\u2019s see how the cost vs value of the \u2018k\u2019 graph looks like.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image10.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80126\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image10.jpg\" alt=\"Squared sum error in ML\" width=\"950\" height=\"650\" \/><\/a><\/p>\n<p>After finding this value of \u2018k\u2019, the rest of the clustering procedure is the same.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image11.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80127\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/k-means-image11.jpg\" alt=\"k-means clustering in ML\" width=\"1250\" height=\"500\" \/><\/a><\/p>\n<p>The above image is a brief explanation of how the procedure follows when the elbow point technique comes into play.<\/p>\n<h3>Advantages of K-means Clustering in ML<\/h3>\n<ul>\n<li>It works well with large datasets and it\u2019s very easy to implement.<\/li>\n<li>In clustering, especially in K-means, we have the benefit of having a convergence stage in the final as it\u2019s a good indicator of stable clusters. The program stops when the best result comes out.<\/li>\n<li>We can use numerous examples as data in it. 
It is a very adaptive type of algorithm.<\/li>\n<li>It produces compact, roughly spherical clusters that are easy to visualize and interpret.<\/li>\n<li>The clusters of k-means do not overlap with each other, because each point belongs to exactly one cluster and the clustering is non-hierarchical.<\/li>\n<li>K-means is usually faster than hierarchical clustering, especially on large datasets.<\/li>\n<li>The clusters produced can be denser and tighter than those from hierarchical clustering, especially when the clusters are globular.<\/li>\n<\/ul>\n<h3>Disadvantages of K-means Clustering in Machine Learning<\/h3>\n<ul>\n<li>We need to choose the value of \u2018k\u2019 ourselves. We can use a longer method like the elbow point method, but it takes extra time.<\/li>\n<li>It\u2019s hard to cluster data whose clusters vary in density and size.<\/li>\n<li>Outliers can cause problems for the position of the centroid. If an outlier is in the cluster, it pulls the centroid towards itself.<\/li>\n<li>It isn\u2019t suitable for non-convex shaped clusters.<\/li>\n<\/ul>\n<h3>K-Means Clustering Applications<\/h3>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/Application-of-K-means-Clustering-in-ML.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-80136\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2020\/10\/Application-of-K-means-Clustering-in-ML.jpg\" alt=\"Applications of K-means Clustering in ML\" width=\"1000\" height=\"700\" \/><\/a><\/p>\n<p>We have various interesting areas of application of K-means Clustering.<\/p>\n<p><strong>1. Insurance Fraud Detection<\/strong><\/p>\n<p>We make use of past data and draw out various patterns, so that any new case can be compared against the clusters that the algorithm has created and assigned to the one it lies closest to. It\u2019s an important thing to do as it can really help us to <strong>reduce<\/strong> insurance fraud.<\/p>\n<p><strong>2. 
Search Engines<\/strong><\/p>\n<p>The algorithm helps to form various clusters of possible results. If any search happens using the search engine, the result will be from a particular cluster, as search engines have a huge collection of clusters.<\/p>\n<p>The query would incline to a particular one prompting the algorithm to give out the accurate result.<\/p>\n<p><strong>3. Customer Segmentation<\/strong><\/p>\n<p>Companies find this very beneficial for improving their customer base. They would generally create various clusters of types of customers and would specifically target a few of them to improve their campaign.<\/p>\n<p><strong>4. Crime Hot-Spot Detection<\/strong><\/p>\n<p>This is a useful one for any city\u2019s police department. Using this model, they would identify specific areas with a high frequency of crimes and can gain a lot of information from this.<\/p>\n<p><strong>5. Diagnostic Systems<\/strong><\/p>\n<p>In the medical profession, professionals deal with various cases of ailments, both severe and normal. This clustering algorithm can help them in various ways like, detect ailments by taking in data of symptoms. They can also help in being smart support systems and help in making better decisions.<\/p>\n<h3>Conclusion<\/h3>\n<p>So, we had an in-depth discussion in the middle section, where we saw the working of the k-means clustering algorithm. We learned some new stuff like the elbow point technique that were newer ways to calculate \u2018k\u2019. This article was also about understanding the <strong>inner concepts<\/strong> of the algorithm.<\/p>\n<p>We also looked at some pros and cons of k-means. And finally, tried to wrap the topic with some diverse applications that we can learn about.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, we will specifically focus on one of the popular algorithms of clustering i.e., K-means clustering. This is one of the most used clustering algorithms. 
Here, we will mainly look at how&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":80130,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[210],"tags":[3316,3317,3318],"class_list":["post-80080","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-k-means-clustering","tag-k-means-clustering-in-machine-learning","tag-k-means-clustering-in-ml"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>K-Means Clustering in Machine Learning - TechVidvan<\/title>\n<meta name=\"description\" content=\"K-means clustering is most popular unsupervised machine learning algorithms. It computes centroids &amp; iterates until it finds optimal centroid\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"K-Means Clustering in Machine Learning - TechVidvan\" \/>\n<meta property=\"og:description\" content=\"K-means clustering is most popular unsupervised machine learning algorithms. 
It computes centroids &amp; iterates until it finds optimal centroid\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-10-26T03:30:55+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/10\/K-Means-Clustering-in-Machine-Learning.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"13 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"K-Means Clustering in Machine Learning - TechVidvan","description":"K-means clustering is most popular unsupervised machine learning algorithms. It computes centroids & iterates until it finds optimal centroid","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/","og_locale":"en_US","og_type":"article","og_title":"K-Means Clustering in Machine Learning - TechVidvan","og_description":"K-means clustering is most popular unsupervised machine learning algorithms. 
It computes centroids & iterates until it finds optimal centroid","og_url":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2020-10-26T03:30:55+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/10\/K-Means-Clustering-in-Machine-Learning.jpg","type":"image\/jpeg"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. reading time":"13 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"K-Means Clustering in Machine Learning","datePublished":"2020-10-26T03:30:55+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/"},"wordCount":2373,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/10\/K-Means-Clustering-in-Machine-Learning.jpg","keywords":["K-means Clustering","K-Means Clustering in Machine learning","K-Means Clustering in ML"],"articleSection":["Machine Learning 
Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/","url":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/","name":"K-Means Clustering in Machine Learning - TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/10\/K-Means-Clustering-in-Machine-Learning.jpg","datePublished":"2020-10-26T03:30:55+00:00","description":"K-means clustering is most popular unsupervised machine learning algorithms. It computes centroids & iterates until it finds optimal centroid","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/10\/K-Means-Clustering-in-Machine-Learning.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/10\/K-Means-Clustering-in-Machine-Learning.jpg","width":1200,"height":628,"caption":"K-Means Clustering in Machine 
Learning"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/machine-learning-k-means-clustering\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"K-Means Clustering in Machine Learning"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/80080","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=80080"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/80080\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/80130"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=80080"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=80080"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=80080"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}