Journey Through Libraries of Python
Course Lessons
S.No | Lesson Title |
---|---|
1 | Introduction |
2 | Why libraries are heavily used in Python |
3 | Some Popular Libraries |
3.1 | NumPy |
3.2 | Scikit-learn |
3.3 | Pandas |
3.4 | Keras |
3.5 | Matplotlib |
3.6 | TensorFlow |
3.7 | PyTorch |
4 | Ways of Installing Libraries |
5 | Conclusion |
Introduction
Libraries are an integral part of Python. We use different libraries for tasks ranging from plotting curves to solving complex mathematical equations, which lets us get a lot done without writing our own code from scratch. For example, if you are working with numerical data and need to do complex calculations, libraries such as NumPy, pandas, and SciPy can do those calculations for you. In this article, we'll talk about the significance of libraries in Python and why they became so popular.
Why libraries are heavily used in Python
A Python library is commonly defined as a collection of reusable code that can be included and used in other programs or projects. A library is made up of modules, and each module contains code for a specific set of tasks. There are many reasons why these collections of code, also called packages, are used so heavily. Let's discuss a few of them.
The biggest advantage of Python libraries is that the popular ones are open source and constantly developed by a large community of Python developers. This means that bugs and usage issues tend to be found and resolved quickly. Another advantage is that several libraries are often available for similar tasks, which leaves us with multiple options to explore for the task at hand.
Writing a piece of code and keeping it reusable for future tasks can be difficult, because it constantly needs changes. What if someone has already taken care of both the code and those changes? This is one more advantage of libraries: they save us the time of writing chunks of code and optimizing them for the best results, which speeds up the progress of any project.
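To make the idea of a library as a collection of modules concrete, here is a minimal sketch using Python's own standard library, which needs no installation. The `values` list is made-up sample data used only for illustration.

```python
# Two modules from the Python standard library, each focused on a specific task.
import math        # mathematical functions
import statistics  # basic descriptive statistics

values = [2.0, 4.0, 4.0, 5.0]  # made-up sample data

root = math.sqrt(16)               # square root from the math module
average = statistics.mean(values)  # arithmetic mean from the statistics module

print(root)     # 4.0
print(average)  # 3.75
```

Neither calculation had to be written from scratch; importing the right module was enough.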
Some Popular Libraries
There are many popular libraries in Python for tasks such as plotting, data handling, generating numerical and analytical solutions, and machine learning. Those who frequently do plotting and exploratory data analysis will have plotting libraries like matplotlib, seaborn, and plotly at the top of their list of most used libraries. Those who work as data scientists will have scikit-learn (sklearn), TensorFlow, Keras, and similar libraries at the top of theirs. Likewise, there are many other libraries that are frequently used and popular because of their ease of use and the large number of functions they offer for different tasks. In this section, we'll discuss these top-priority libraries and look at how they are used for different computations.
NumPy
NumPy is used beneath the hood by many other machine learning libraries to perform the numeric operations their algorithms need. This makes it one of the most used libraries in Python. Its most important feature is the wide range of mathematical operations available for arrays and matrices.
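As a small illustration of the array and matrix operations mentioned above, the following sketch applies a few common NumPy operations to two made-up 2x2 matrices. The matrices `a` and `b` are sample values chosen only for this example.

```python
import numpy as np

# Two small made-up matrices
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

elementwise_sum = a + b        # adds matching entries
matrix_product = a @ b         # matrix multiplication
column_means = a.mean(axis=0)  # mean of each column
a_inverse = np.linalg.inv(a)   # matrix inverse

print(matrix_product)  # [[19. 22.] [43. 50.]]
print(column_means)    # [2. 3.]
```

All of these operate on whole arrays at once, so no explicit Python loops are needed.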
Scikit-learn
Scikit-learn is one of the most popular libraries for machine learning. It provides many algorithms that can be used directly for data science tasks. The algorithms and modules are easy to use, which helps in structuring a project without coding everything from scratch. Its simple interface and a user guide that explains many of the algorithms make it a must-know for anyone interested in data science.
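The sketch below shows what using an algorithm directly can look like in practice. It assumes the iris dataset bundled with scikit-learn and a logistic regression classifier. Both are choices made for this illustration, and any other estimator would follow the same fit and predict pattern.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small example dataset that ships with scikit-learn
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train the classifier and predict labels for the held-out samples
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping in a different algorithm usually means changing only the import and the line that creates `model`.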
Pandas
The name pandas comes up whenever data manipulation in Python is discussed. It provides high-level data structures and a wide variety of tools for data analysis. It has many features for grouping, combining, and filtering data, as well as time-series functionality. The user guide is good and the community is large, which makes it easier to resolve issues quickly while working with it.
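Filtering, grouping, and combining can be sketched in a few lines. The `sales` and `managers` tables below are made-up sample data, and the column names are hypothetical.

```python
import pandas as pd

# Made-up sample data
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south", "east"],
    "units": [10, 4, 6, 8, 3],
})
managers = pd.DataFrame({
    "region": ["north", "south"],
    "manager": ["Asha", "Ben"],
})

# Filtering: keep only rows with at least 5 units
large_orders = sales[sales["units"] >= 5]

# Grouping: total units per region
units_per_region = sales.groupby("region")["units"].sum()

# Combining: attach the manager for each region (missing ones become NaN)
merged = sales.merge(managers, on="region", how="left")

print(units_per_region)
```

Here `units_per_region` holds 3 for east, 16 for north, and 12 for south, and the east row in `merged` has no manager because that region is absent from `managers`.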
Keras
Keras is a comparatively recent library for neural networks. It is easier to use than many other neural network libraries because it provides built-in functions and pre-trained models behind a simple interface. It is built on top of TensorFlow and hides the direct handling of the computational graph that TensorFlow exposes. Because it creates the computational graph through the back-end infrastructure, it can be slower than other deep learning libraries. The modular structure of its many layers, objectives, and activation functions makes it much easier to use.
Matplotlib
Matplotlib was developed by John D. Hunter. The library is built on NumPy arrays and works with the broader SciPy stack. It is an open-source, low-level plotting library for Python that is constantly under development. It is easy to use, and having been around for about two decades, it has built a large community of users and developers. It covers almost every plot one can think of, with detailed examples in the user guides.
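A basic line plot gives a feel for how Matplotlib works together with NumPy arrays. This is a minimal sketch. The `Agg` backend and the in-memory buffer are choices made here so that the example runs without a display and without writing a file, and in a notebook or desktop session you would typically call `plt.show()` instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

# 100 evenly spaced points between 0 and 2*pi
x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
line, = ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("A simple line plot")
ax.legend()

# Render the figure as PNG data in memory instead of saving it to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```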
TensorFlow
TensorFlow is an open-source machine learning framework developed by Google. It is flexible and supports model deployment on different platforms. TensorFlow offers more low-level modularity than Keras, which makes it popular among those who want to build custom machine learning or deep learning models from scratch. Another important aspect of TensorFlow is its user guide and the large number of tutorials on the official website covering different applications. All of this makes it easier for a beginner to learn TensorFlow from scratch.
PyTorch
PyTorch is another popular deep learning library, commonly used for computer vision and natural language processing tasks. It was developed by Facebook and, like TensorFlow, offers high modularity.
Ways of Installing Libraries
The easiest way for an aspiring data scientist to get these libraries is to install the freely available Anaconda distribution, which ships with many of them. You can also use pip from the Anaconda prompt or your PC's terminal to install the libraries you need directly. pip is a package management system written in Python that is used to install and manage packages. Use the following commands to install packages with pip or conda on Windows.
# works in the Anaconda prompt or the Windows terminal
pip install library_name
# run this if pip is not available in your Anaconda environment
conda install pip
# you can also install with conda directly, without pip, in the Anaconda prompt
conda install pandas
For Linux (Ubuntu) users the process is very similar. pip and conda work the same way, and many libraries can also be installed as system packages with apt-get, run through sudo. The following command installs a package this way. The exact package name varies from library to library.
sudo apt-get install python3-package_name
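After installing, it can be useful to confirm from Python that a library is actually available. The following is a minimal sketch using only the standard library. The helper names `is_installed` and `installed_version` are hypothetical and introduced here purely for illustration.

```python
import importlib.util
from importlib import metadata


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Report the status of a few of the libraries discussed above
for name in ["numpy", "pandas", "matplotlib"]:
    if is_installed(name):
        print(f"{name} is installed, version {installed_version(name)}")
    else:
        print(f"{name} is missing; try: pip install {name}")
```

Note that the import name and the name used for installation can differ. For example, scikit-learn is installed as `scikit-learn` but imported as `sklearn`.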
Conclusion
There are a large number of packages in Python, and many more are under development. These packages have made Python an easy-to-use language that offers more functionality in fewer lines of code. You can also develop your own package and release it if you want to contribute to the ever-growing community. Happy learning!
References
- https://numpy.org/
- https://www.scipy.org/
- https://pandas.pydata.org
- https://scikit-learn.org/stable/
- https://keras.io/
- https://matplotlib.org/
- https://www.tensorflow.org/guide
- https://pytorch.org/