informatica online training

Posted by Harika on February 27th, 2021

Power BI is a proprietary Microsoft product for business intelligence work. Since 2018, Power BI has made it easy to integrate statistical and general-purpose languages such as R and Python.

How does it benefit you?

If you are a business intelligence (BI) specialist and want to undertake data science activities, you have to depend on the data analytics team. Conversely, a Python developer has to depend on the BI team to present their analysis in a presentable form, such as a dashboard. Since you can now run Python inside an integrated environment, Power BI removes this co-dependence.

The modern data scientist needs to understand the full pipeline for solving complex business problems. This typically entails collecting, cleaning, exploring, and transforming data to make predictions about future events. Finally, the analysis is presented in a dashboard or a report. As a result, the conventional roles of a business intelligence specialist and a predictive analytics expert are becoming blurred.

This increases the need for a robust tool that can carry out all of the tasks listed above within the analytical pipeline. If that tool comes from the makers of Excel, so much the better. Power BI is therefore the talk of the town for embedded analytics.

Integrated Environment Configuration

Getting an integrated environment up and running is the first step. To do this, you need a Python distribution installed on your computer. I recommend the base distribution of Python for this purpose. I use Anaconda for all my coding assignments; nevertheless, combining Anaconda with Power BI can be a tricky exercise.

After installation, you need to install four Python packages in this environment: pandas (for data manipulation and analysis), Matplotlib and Seaborn (for plotting), and NumPy (for scientific computing).

To install these packages, you can use the pip command in your command-line tool.

pip install pandas

pip install matplotlib

pip install numpy

pip install seaborn
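Once the four commands have finished, a quick check like the one below can confirm that the packages are visible to your interpreter. This is only a minimal sketch: the helper name missing_packages is my own, and it relies on nothing beyond the standard library.

```python
import importlib.util

# The four packages this article relies on.
REQUIRED = ["pandas", "matplotlib", "numpy", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", missing if missing else "none")
```

If anything is listed as missing, rerun the matching pip command with the same Python installation that Power BI will use.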

After installing these packages, we have to enable Python scripting in Power BI. Open Power BI to see whether it automatically detects the Python distribution installed on your computer. Go to File -> Options and settings -> Options. Under Python scripting, you can see the home directory of the Python installation on your computer.
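If you are not sure which folder Power BI should be pointing at, you can ask Python itself. The sketch below prints the folder that contains the running interpreter; on a standard Windows installation I would expect this to be the home directory shown in the options dialog, but treat that as an assumption and compare the two paths yourself.

```python
import os
import sys

# Folder that holds the interpreter that is running this script.
python_home = os.path.dirname(sys.executable)
print("Python home directory:", python_home)
```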

Use Python script to import data

Now you can run a short test to verify that Python works inside Power BI. To begin with, you can import a small dataset into Power BI using a Python script.

To do this, go to the Home ribbon, click Get Data, and then select Other. Apart from scripts such as R or Python, this section lets you import data from a varied list of sources, including the Web, Hadoop Distributed File System (HDFS), Spark, etc. Here, I will import the Churn Forecast dataset that is stored on my desktop.

Integrating Python with Power BI: Get Data in Power BI

Just click Connect. This opens a dialog in which the following Python script can be written:

Integrating Python with Power BI: Script from Python
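The script itself is not reproduced in the text here, so the following is only a minimal sketch of what such an import script could look like. The file name and location in CSV_PATH are hypothetical, as are the names load_churn and churn; replace the path with wherever your copy of the data lives.

```python
import os

import pandas as pd

# Hypothetical location of the churn data; replace with your own path.
CSV_PATH = os.path.join(os.path.expanduser("~"), "Desktop", "churn_prediction.csv")


def load_churn(path):
    """Read the churn CSV file into a pandas DataFrame."""
    return pd.read_csv(path)


# Only load the file if it exists at the assumed path, so that a wrong
# path does not raise an error when trying the script outside Power BI.
if os.path.exists(CSV_PATH):
    churn = load_churn(CSV_PATH)
```

Data frames created at the top level of the script, such as churn here, are what the Navigator offers for loading in the next step.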

Once you click OK, the Navigator loads and prompts you to select the churn data and then click Load. To verify that the data has been loaded, go to the Data view. You are now ready to use Power Query for one-click data transformations.

Transforming data by using Power Query

Those who have climbed the Python learning curve know that transforming data is a fairly basic task, but it may not be as straightforward for someone who is just starting their data science journey.

With the Power Query Editor, however, we can shape and transform data with a single click. Not only that, but Power BI also keeps a log of every step in the data transformation pipeline so that it can be reviewed later. We will use Power Query to demonstrate the basics of data transformation.

Once you have loaded the data into Power BI, click Transform Data under the Home tab to open the Query Editor.

This opens the Query Editor and offers you a variety of options for data cleaning, reshaping, and conversion.

We convert the customer nw category attribute into a text field, because its values represent the Customer Net Worth Category and should not be treated as a continuous variable.

To do this, we select the column, go to Data Type, and change the data type to Text. Power Query records this step under the Applied Steps section. It is good practice to rename the step so that it is easy to recall later. We're going to rename it "nw cat Text." Likewise, we convert the churn column into a logical (True/False) column, with True for 1 (churned) and False for 0 (not churned), and rename the step to True/False.
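For readers who are more at home in Python, the same two type changes can be sketched in pandas. This is an illustration, not part of the Power Query workflow: the rows are made up, and the column names customer_nw_category and churn are my assumption about how the fields are spelled in the data.

```python
import pandas as pd

# Made-up rows standing in for the churn data; column names are assumptions.
churn_data = pd.DataFrame({
    "customer_nw_category": [1, 2, 3, 2],
    "churn": [1, 0, 0, 1],
})

# Treat the net worth category as a label rather than a number.
churn_data["customer_nw_category"] = churn_data["customer_nw_category"].astype(str)

# Map 1 to True (churned) and 0 to False (not churned).
churn_data["churn"] = churn_data["churn"].astype(bool)

print(churn_data.dtypes)
```

The difference in Power Query is that each of these changes is recorded as a named step that you can inspect or undo later.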

To apply these transformations to the data, click Close & Apply (in the top left corner) once you have completed the transformation stage.

Use statistics from Python inside Power BI

While Power BI has a rich visualization library, constructing a correlation matrix inside it is not trivial. Yet the correlation matrix heatmap is an integral part of data analysis reports.

In this part, we will show how to use the correlation function available in Python to construct a correlation matrix heatmap. This heatmap will be displayed in the Report section of Power BI.

In Power BI, head over to the Report section and, under Visualizations, click the Python visual denoted by the Py symbol. An empty Python visual appears on the left and a Python script editor shows up at the bottom. In other words, Power BI gives you the option to construct visualizations with scripts.

You'll find that the Values field is currently empty.

To build the correlation heatmap, we drag all the continuous variables into the Values area: age, all the average monthly balance columns, the current and previous month balance columns, the current and previous month transaction columns, the number of dependents, and vintage (the length of the association). This is an important step. Otherwise, Power BI will not recognize these variables as part of the visualization.

When we drag the variables into the Values field, the Python script editor is automatically filled with the following code.

# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:

# dataset = pandas.DataFrame(age, average_monthly_balance_prevQ, average_monthly_balance_prevQ2, current_balance, current_month_balance, current_month_credit, current_month_debit, dependents, previous_month_balance, previous_month_credit, previous_month_end_balance, previous_month_debit, vintage)

# dataset = dataset.drop_duplicates()

# Paste or type your script code here:
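The heatmap script that goes below this preamble is not shown in the text, so here is a minimal sketch of one way to write it with pandas and Seaborn. Inside Power BI the data frame named dataset already exists; the small stand-in with made-up numbers and assumed column names is only there so that the script can also be tried outside Power BI.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Power BI supplies `dataset`. This made-up stand-in is used only when the
# script runs somewhere else.
if "dataset" not in globals():
    dataset = pd.DataFrame({
        "age": [25, 40, 33, 58, 47],
        "current_balance": [1200.0, 5300.0, 800.0, 9100.0, 4000.0],
        "previous_month_balance": [1100.0, 5000.0, 950.0, 8800.0, 4200.0],
    })

# Pairwise Pearson correlation between the numeric columns.
corr = dataset.select_dtypes("number").corr()

# Draw the matrix as an annotated heatmap on a fixed -1 to 1 colour scale.
fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation matrix")
plt.tight_layout()

# Power BI renders the figure that is shown here.
plt.show()
```

Fixing the colour scale at -1 to 1 keeps the colours comparable when the visual is later filtered to different groups of customers.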

Producing analytical reports

After examining the heatmap, we can infer the following.

For all customers:

There is no correlation between age or number of dependents and the other variables.

The average monthly balances of the last two quarters are mildly correlated.

The average monthly balance of the last quarter is strongly correlated with the balance of the current month and the balance of the previous month.

We can build this heatmap for customers who have churned and compare it with the one for those who have not. To do so, we add a churn = True or False filter using the blue boxes and look at the heatmap for each of the two customer groups separately.

The chart below shows the picture for customers who have not churned. A broadly similar story unfolds for both groups of customers, though the correlation between the average monthly balance of the last two quarters and the balances of the current and previous months is much stronger for customers who have not churned.
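The same comparison can be sketched outside the report by computing one correlation matrix per churn group in pandas. The numbers below are made up purely to make the example runnable, and the column names and the helper corr_by_group are assumptions of mine, so do not read any result into them.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
customers = pd.DataFrame({
    "churn": [True, True, True, True, False, False, False, False],
    "current_balance": [100.0, 250.0, 80.0, 400.0, 1200.0, 3000.0, 2500.0, 1800.0],
    "previous_month_balance": [300.0, 200.0, 150.0, 350.0, 1150.0, 3100.0, 2400.0, 1900.0],
})


def corr_by_group(frame, group_col):
    """Return one correlation matrix per value of group_col."""
    return {
        bool(key): group.drop(columns=group_col).corr()
        for key, group in frame.groupby(group_col)
    }


matrices = corr_by_group(customers, "churn")
for churned, matrix in matrices.items():
    print("churn =", churned)
    print(matrix.round(2))
```

This mirrors what the filter does in the report: the heatmap is redrawn from only the rows that belong to the selected group.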

Not Churn

End Notes

In this article, we learned how to integrate Python into Power BI. To create an analytical report, we used Power BI's reporting capabilities along with Python's analytical capabilities.

In conclusion, this integrated environment gives data scientists and business intelligence experts more power. They can quickly capitalize on the strengths of both tools. Learn more on this topic through Power BI Online Training.
