Machine Learning 101: Supervised vs Unsupervised Learning


Nov 17, 2023 | By Ananya Chakraborty


Have you ever wondered how machine learning systems actually operate? If your answer is yes, then you have come to the right place.


In today's article on Machine Learning 101, we will provide a comprehensive overview of the two approaches, supervised and unsupervised learning: we will explain their core differences, look at the algorithms each uses, highlight the challenges they face, and see them in action in real-world applications.


At its essence, the answer is straightforward: supervised learning relies on labeled data to make predictions, while unsupervised learning works on unlabeled data.


Here’s what we’ll cover:


  • What is Supervised Learning?

  • What is Unsupervised Learning?

  • Supervised Learning vs. Unsupervised Learning: Key differences

  • Real-World Applications

  • Key Takeaways and Conclusion

    What is Supervised Learning?

    Simply put, supervised learning is like teaching a child with a labeled picture book of animals and plants. The algorithm is fed training data that includes the desired solutions, known as labels. It then learns from this labeled data and makes predictions or decisions based on what it has learned; a short code sketch of this workflow follows the list of characteristics below.

    Key Characteristics of Supervised Learning:

    • Training data is labeled, i.e., it includes the solutions/outputs.

    • Problems addressed: Classification, regression, prediction.

    • Algorithms learn patterns that connect inputs to outputs.

    • The goal is to accurately predict the correct outputs for unseen data.

    • Examples of algorithms: Linear regression, random forest, SVM, and neural networks.
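
To make this concrete, here is a minimal sketch of the supervised workflow described above, assuming scikit-learn is installed; the tiny dataset, the feature meanings, and the choice of logistic regression are illustrative rather than taken from the article.

```python
# Minimal supervised-learning sketch (assumes scikit-learn is installed).
# The toy dataset and the choice of logistic regression are illustrative.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Labeled data: each row is [hours_studied, hours_slept]; label 1 = passed the exam.
X = [[2, 9], [1, 5], [5, 1], [6, 8], [7, 6], [3, 4], [8, 7], [1, 2]]
y = [1, 0, 0, 1, 1, 0, 1, 0]

# Hold out part of the labeled data to check predictions on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression()          # learns the input -> output mapping
model.fit(X_train, y_train)           # training on labeled examples

predictions = model.predict(X_test)   # predicting labels for unseen data
print("Accuracy on held-out data:", accuracy_score(y_test, predictions))
```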

    What is Unsupervised Learning?

    Unsupervised learning, by contrast, works on unlabeled data: the algorithm is given inputs without the desired solutions and must discover patterns, groupings, or structure in the data on its own. The main types are listed below, followed by a short code sketch of two of them.

      Types of Unsupervised Learning:

      • Clustering:

        • K-means Clustering:  Groups data into a 'K' number of clusters.

        • Hierarchical Clustering: Groups data into a tree of clusters.

      • Association Rule Learning:

        • Apriori Algorithm: Identifies sets of items that frequently occur together in a dataset.

        • Eclat Algorithm: A faster alternative to Apriori for finding frequent itemsets.

      • Dimensionality Reduction:

        • Principal Component Analysis (PCA): Reduces the dimensionality of the data by projecting it onto a smaller number of directions (principal components) that capture most of the variance.

      • Anomaly Detection:

        • Identifies unusual or rare items in a dataset (e.g., fraud detection).

      • Neural Networks:

        • Self-Organizing Maps (SOMs): Used for reducing dimensionality and visualizing data.

        • Deep Belief Networks (DBNs): A generative model that can produce new data similar to the input data.

      • Recommendation Systems:

        • Collaborative Filtering: Recommends items to a user based on the preferences of similar users.

        • Content-Based Filtering: Recommends items by comparing the content of the items to a user's profile.
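
The sketch below illustrates two of these techniques, K-means clustering and PCA, assuming scikit-learn is installed; the synthetic data and parameter values are purely illustrative.

```python
# Minimal sketch of K-means clustering and PCA (assumes scikit-learn is installed).
# The synthetic data and parameter values are illustrative.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Unlabeled data: 300 points in 5 dimensions with 3 hidden groups.
X, _ = make_blobs(n_samples=300, n_features=5, centers=3, random_state=42)

# K-means: group the data into K = 3 clusters (no labels are used).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
cluster_ids = kmeans.fit_predict(X)
print("Points per cluster:", np.bincount(cluster_ids))

# PCA: project the 5-dimensional data onto the 2 directions of largest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print("Reduced shape:", X_2d.shape)
print("Variance explained:", pca.explained_variance_ratio_.round(2))
```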

      Steps in Unsupervised Learning:

      1. Data preprocessing - collect unlabeled data, clean, preprocess

      2. Determine the number of clusters or latent factors

      3. Model estimation - train unsupervised algorithm on data

      4. Analyze model results qualitatively and quantitatively

      5. Iterate with different hyperparameters for optimization (a short sketch of this workflow follows)
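
As a rough illustration of these steps, the sketch below standardizes synthetic data, compares several candidate values of K using silhouette scores, and then fits and inspects the chosen model; scikit-learn is assumed, and every value shown is illustrative.

```python
# Rough sketch of the unsupervised workflow above (assumes scikit-learn is installed).
# Synthetic data stands in for step 1's collected and cleaned data.
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Step 1: preprocess - here we simply standardize the (synthetic) features.
X, _ = make_blobs(n_samples=500, centers=4, n_features=3, random_state=7)
X_scaled = StandardScaler().fit_transform(X)

# Step 2: determine the number of clusters by comparing silhouette scores.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=7).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)
best_k = max(scores, key=scores.get)
print("Silhouette scores:", {k: round(s, 3) for k, s in scores.items()})

# Step 3: model estimation - fit the final model with the chosen K.
model = KMeans(n_clusters=best_k, n_init=10, random_state=7).fit(X_scaled)

# Step 4: analyze the results (cluster sizes and centers); step 5 would repeat
# the loop above with different hyperparameters if the results look poor.
print("Chosen K:", best_k)
print("Cluster centers:\n", model.cluster_centers_.round(2))
```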


      Challenges:

      1. Optimal Number of Clusters: With no labeled ground truth, choosing the right number of clusters or latent factors is difficult and usually relies on heuristics such as the elbow method or silhouette scores.

      2. High Dimensionality: Distance-based methods become less reliable as the number of features grows (the "curse of dimensionality").

      3. Computational Complexity: Many unsupervised algorithms scale poorly with the number of data points or features, so computational resources must be managed carefully.

      4. Noisy Data: Outliers and noise can significantly distort the clusters or patterns the model finds.


      Examples of Unsupervised Learning Algorithms:

      1. K-means example: Segmenting customers into clusters based on attributes like age, income, spending patterns, etc.

      2. Hierarchical clustering example: Clustering genes into groups based on similarity in expression patterns.

      3. PCA example: Reducing the dimensionality of high-dimensional facial image data to 2D for visualization.

      4. Association rule learning example: Analyzing shopping baskets to determine which products are frequently purchased together (a toy sketch of the underlying counting step follows this list).
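
To illustrate the last example, here is a toy, pure-Python sketch of the support-counting idea that algorithms such as Apriori perform at scale; the shopping baskets are invented for illustration.

```python
# Toy sketch of the support counting behind association rule learning.
# Real systems use Apriori or similar; the baskets below are invented.
from itertools import combinations
from collections import Counter

baskets = [
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "butter", "beer"},
    {"bread", "milk"},
]

# Count how often each pair of products appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = fraction of baskets containing the pair; report the frequent pairs.
n = len(baskets)
for pair, count in pair_counts.most_common():
    support = count / n
    if support >= 0.4:  # illustrative minimum-support threshold
        print(f"{pair}: support = {support:.2f}")
```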