Planet Python
Last update: June 06, 2020 04:47 AM UTC
June 05, 2020
Stack Abuse
Reading and Writing Excel (XLSX) Files in Python with the Pandas Library
Introduction
Just like with all other types of files, you can use the Pandas library to read and write Excel files using Python as well. In this short tutorial, we are going to discuss how to read and write Excel files via DataFrames.
In addition to simple reading and writing, we will also learn how to write multiple DataFrames into an Excel file, how to read specific rows and columns from a spreadsheet, and how to name single and multiple sheets within a file before doing anything.
If you'd like to learn more about other file types, we've got you covered:
- Reading and Writing JSON Files in Python with Pandas
- Reading and Writing CSV Files in Python with Pandas
Reading and Writing Excel Files in Python with Pandas
Naturally, to use Pandas, we first have to install it. The easiest method to install it is via pip.
If you're running Windows:
$ python -m pip install pandas
If you're using Linux or MacOS:
$ pip install pandas
Note that you may get a ModuleNotFoundError or ImportError when running the code in this article. For example:
ModuleNotFoundError: No module named 'openpyxl'
If this is the case, then you'll need to install the missing module(s):
$ pip install openpyxl xlsxwriter xlrd
Writing Excel Files Using Pandas
We'll be storing the information we'd like to write to an Excel file in a DataFrame. Using the built-in to_excel() function, we can export this information to an Excel file.
First, let's import the Pandas module:
import pandas as pd
Now, let's use a dictionary to populate a DataFrame:
df = pd.DataFrame({'States': ['California', 'Florida', 'Montana', 'Colorado', 'Washington', 'Virginia'],
                   'Capitals': ['Sacramento', 'Tallahassee', 'Helena', 'Denver', 'Olympia', 'Richmond'],
                   'Population': ['508529', '193551', '32315', '619968', '52555', '227032']})
The keys in our dictionary will serve as column names. Similarly, the values become the rows containing the information.
Now, we can use the to_excel() function to write the contents to a file. The only argument is the file path:
df.to_excel('./states.xlsx')
Here's the Excel file that was created:

Please note that we are not using any parameters in our example. Therefore, the sheet within the file retains its default name - "Sheet1". As you can see, our Excel file has an additional column containing numbers. These numbers are the indices for each row, coming straight from the Pandas DataFrame.
We can change the name of our sheet by adding the sheet_name parameter to our to_excel() call:
df.to_excel('./states.xlsx', sheet_name='States')
Similarly, adding the index parameter and setting it to False will remove the index column from the output:
df.to_excel('./states.xlsx', sheet_name='States', index=False)
Now, the Excel file looks like this:

Writing Multiple DataFrames to an Excel File
It is also possible to write multiple dataframes to an Excel file. If you'd like to, you can set a different sheet for each dataframe as well:
income1 = pd.DataFrame({'Names': ['Stephen', 'Camilla', 'Tom'],
                        'Salary': [100000, 70000, 60000]})

income2 = pd.DataFrame({'Names': ['Pete', 'April', 'Marty'],
                        'Salary': [120000, 110000, 50000]})

income3 = pd.DataFrame({'Names': ['Victor', 'Victoria', 'Jennifer'],
                        'Salary': [75000, 90000, 40000]})

income_sheets = {'Group1': income1, 'Group2': income2, 'Group3': income3}
writer = pd.ExcelWriter('./income.xlsx', engine='xlsxwriter')

for sheet_name in income_sheets.keys():
    income_sheets[sheet_name].to_excel(writer, sheet_name=sheet_name, index=False)

writer.save()
Here, we've created 3 different dataframes containing various names of employees and their salaries as data. Each of these dataframes is populated by its respective dictionary.
We've combined these three within the income_sheets variable, where each key is the sheet name, and each value is the DataFrame object.
Finally, we've used the xlsxwriter engine to create a writer object. This object is passed to the to_excel() function call.
Then we loop through the keys of income_sheets and, for each key, write the corresponding DataFrame to the sheet with that name.
Here is the generated file:

You can see that the Excel file has three different sheets named Group1, Group2, and Group3. Each of these sheets contains the names of employees and their salaries, matching the data in the three different DataFrames in our code.
The engine parameter in the to_excel() function is used to specify which underlying module is used by the Pandas library to create the Excel file. In our case, the xlsxwriter module is used as the engine for the ExcelWriter class. Different engines can be specified depending on their respective features.
Depending upon the Python modules installed on your system, the other options for the engine attribute are: openpyxl (for xlsx and xlsm), and xlwt (for xls).
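If you want to be explicit about which engine is used, to_excel() accepts it directly as well; a minimal sketch, assuming openpyxl is installed:
df.to_excel('./states.xlsx', sheet_name='States', index=False, engine='openpyxl')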
Further details of using the xlsxwriter module with Pandas library are available at the official documentation.
Last but not least, in the code above we have to explicitly save the file using writer.save(), otherwise it won't be persisted on the disk.
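Alternatively, ExcelWriter can be used as a context manager, which takes care of saving the file when the block exits; a small sketch of the same loop written that way:

with pd.ExcelWriter('./income.xlsx', engine='xlsxwriter') as writer:
    for sheet_name, frame in income_sheets.items():
        frame.to_excel(writer, sheet_name=sheet_name, index=False)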
Reading Excel Files with Pandas
In contrast to writing DataFrame objects to an Excel file, we can do the opposite by reading Excel files into DataFrames. Packing the contents of an Excel file into a DataFrame is as easy as calling the read_excel() function:
students_grades = pd.read_excel('./grades.xlsx')
students_grades.head()
For this example, we're reading this Excel file.
Here, the only required argument is the path to the Excel file. The contents are read and packed into a DataFrame, which we can then preview via the head() function.
Note: Using this method, although the simplest one, will only read the first sheet.
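If you need a sheet other than the first one, read_excel() also accepts a sheet_name argument; passing None loads every sheet into a dictionary of DataFrames. A short sketch (the sheet name 'Grades' is an assumption about the example file):

# Read a single, named sheet
grades = pd.read_excel('./grades.xlsx', sheet_name='Grades')

# Read all sheets at once, as a dict mapping sheet names to DataFrames
all_sheets = pd.read_excel('./grades.xlsx', sheet_name=None)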
Let's take a look at the output of the head() function:

Pandas assigns a row label or numeric index to the DataFrame by default when we use the read_excel() function.
We can override the default index by passing one of the columns in the Excel file as the index_col parameter:
students_grades = pd.read_excel('./grades.xlsx', sheet_name='Grades', index_col='Grade')
students_grades.head()
Running this code will result in:

In the example above, we have replaced the default index with the "Grade" column from the Excel file. However, you should only override the default index if you have a column with values that could serve as a better index.
Reading Specific Columns from an Excel File
Reading a file in its entirety is useful, though in many cases, you'd really want to access a certain element. For example, you might want to read the element's value and assign it to a field of an object.
Again, this is done using the read_excel() function, though this time we'll also pass the usecols parameter, which limits the function to reading only certain columns. Let's add the parameter so that we read the columns that correspond to the "Student Name", "Grade" and "Marks Obtained" values.
We do this by specifying the numeric index of each column:
cols = [0, 1, 3]
students_grades = pd.read_excel('./grades.xlsx', usecols=cols)
students_grades.head()
Running this code will yield:

As you can see, we are only retrieving the columns specified in the cols list.
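usecols also accepts column labels instead of positions, which can be more readable; a quick sketch, assuming the headers in the file match the names above:
students_grades = pd.read_excel('./grades.xlsx', usecols=['Student Name', 'Grade', 'Marks Obtained'])
students_grades.head()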
Conclusion
We've covered some general usage of the read_excel() and to_excel() functions of the Pandas library. With them, we've read existing Excel files and written our own data to them.
Using various parameters, we can alter the behavior of these functions, allowing us to build customized files, rather than just dumping everything from a DataFrame.
PyCharm
Smart execution of R code
The R plugin is announcing some helpful features to track the execution of your R code:
1. Execute your R file as a runnable process, or job. Jobs are shown in a separate tab in the R console. You can preview the job status (succeeded or failed), the duration of the execution, and the time you launched the job.
When starting a new job, you can specify the way you want to process the results of the job execution. You can restrict copying it, copy to the global environment, or copy it into a separate variable. To preview the results, switch to the Variables pane:
2. Try new ways to quickly import data files. You can now download data from CSV, TSV, or XLS files into your global environment:
Once added, the data can be accessed from your R code.
In this release, we also introduced some stability improvements and enhancements for resolving and autocompleting named arguments.
Interested?
Download PyCharm from our website and install the R plugin. See more details and installation instructions in PyCharm documentation.
Catalin George Festila
Python 3.8.3 : Using the fabric python module - part 001.
The tutorial for today is about the fabric Python module. You can read about this Python module on the official webpage. The team comes with this intro: Fabric is a high level Python (2.7, 3.4+) library designed to execute shell commands remotely over SSH, yielding useful Python objects in return... It builds on top of Invoke (subprocess command execution and command-line features) and Paramiko (SSH
Real Python
The Real Python Podcast – Episode #12: Web Scraping in Python: Tools, Techniques, and Legality
Do you want to get started with web scraping using Python? Are you concerned about the potential legal implications? What are the tools required and what are some of the best practices? This week on the show we have Kimberly Fessel to discuss her excellent tutorial created for PyCon 2020 online titled "It's Officially Legal so Let's Scrape the Web."
Doug Hellmann
sphinxcontrib-spelling 5.1.0
sphinxcontrib-spelling is a spelling checker for Sphinx-based documentation. It uses PyEnchant to produce a report showing misspelled words. What’s new in 5.1.0? Add an option to show the line containing a misspelling for context (contributed by Huon Wilson)
Python Bytes
#184 Too many ways to wait with await?
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean - $100 credit for new users to build something awesome.

Michael #1: Waiting in asyncio (https://hynek.me/articles/waiting-in-asyncio/)
- by Hynek Schlawack
- One of the main appeals of using Python's asyncio is being able to fire off many coroutines and run them concurrently. How many ways do you know for waiting for their results?
- The simplest case is to await your coroutines:

result_f = await f()
result_g = await g()

- Drawbacks:
  1. The coroutines do not run concurrently. g only starts executing after f has finished.
  2. You can't cancel them once you started awaiting.
- asyncio.Task wraps your coroutines and gets independently scheduled for execution by the event loop whenever you yield control to it:

task_f = asyncio.create_task(f())
task_g = asyncio.create_task(g())
await asyncio.sleep(0.1)  # <- f() and g() are already running!
result_f = await task_f
result_g = await task_g

- Your tasks now run concurrently, and if you decide that you don't want to wait for task_f or task_g to finish, you can cancel them using task_f.cancel()
- asyncio.gather() takes 1 or more awaitables as *args, wraps them in tasks if necessary, and waits for all of them to finish. Then it returns the results of all awaitables in the same order:

result_f, result_g = await asyncio.gather(f(), g())

- asyncio.wait_for() allows for passing a timeout
- A more elegant approach to timeouts is the async-timeout package on PyPI (https://pypi.org/project/async-timeout/). It gives you an asynchronous context manager that allows you to apply a total timeout even if you need to execute the coroutines sequentially:

async with async_timeout.timeout(5.0):
    await f()
    await g()

- asyncio.as_completed() takes an iterable of awaitables and returns an iterator that yields asyncio.Futures in the order the awaitables are done:

for fut in asyncio.as_completed([task_f, task_g], timeout=5.0):
    try:
        await fut
        print("one task down!")
    except Exception:
        print("ouch")

- Michael's Async Python course: http://talkpython.fm/async

Brian #2: virtualenv is faster than venv
- virtualenv docs (https://virtualenv.pypa.io/en/latest/): "virtualenv is a tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard library under the venv module. The venv module does not offer all features of this library, to name just a few more prominent:
  - is slower (by not having the app-data seed method),
  - is not as extendable,
  - cannot create virtual environments for arbitrarily installed python versions (and automatically discover these),
  - is not upgrade-able via pip,
  - does not have as rich programmatic API (describe virtual environments without creating them)."
- pro: faster: under 0.5 seconds vs about 2.5 seconds
- con: the --prompt is weird. I like the parens and the space, and 3.9's magic "." option for prompt to name it after the current directory.
- pro: the pip you get in your env is already updated
- conclusion:
  - I'm on the fence for my own use. Probably leaning more toward keeping built in. But not having to update pip is nice.
  - For teaching, I'll stick with the built in venv.
  - The "extendable" and "has an API" parts really don't matter much to me.

$ time python3.9 -m venv venv --prompt .
real    0m2.698s
user    0m2.055s
sys     0m0.606s
$ source venv/bin/activate
(try) $ deactivate
$ rm -fr venv
$ time python3.9 -m virtualenv venv --prompt "(try) "
...
real    0m0.384s
user    0m0.202s
sys     0m0.255s
$ source venv/bin/activate
(try) $

Michael #3: Latency in Asynchronous Python (https://nullprogram.com/blog/2020/05/24/)
- Article by Chris Wellons
- Was debugging a misbehaving Python program that makes significant use of Python's asyncio.
- The program would eventually take very long periods of time to respond to network requests.
- The program's author had made a couple of fundamental mistakes using asyncio.
- Scenario:
  - Have a "heartbeat" async method that beats once every ms (heartbeat delay = 0.001s)
  - Have a computational amount of work that takes 10ms
  - Need to run a bunch of these computational things (say 200).
  - But starting the heartbeat blocks the asyncio event loop
  - See my example at https://gist.github.com/mikeckennedy/d9ac5a600f91971c6933b4f41a8df480
- Unsync (https://github.com/alex-sherman/unsync) fixes this and improves the code! Here's my example: https://gist.github.com/mikeckennedy/f23b9b5abd9452cdc8b3bacaf1c3da20
- Need to limit the number of "active" tasks at a time.
- Solving it with a job queue: Here's what does work: a job queue. Create a queue to be populated with coroutines (not tasks), and have a small number of tasks run jobs from the queue (see the sketch after these show notes).

Brian #4: How to Deprecate a PyPI Package (https://www.dampfkraft.com/code/how-to-deprecate-a-pypi-package.html)
- Paul McCann, @polm23
- A collection of options for how to get people to stop using your package on PyPI. Also includes code samples or example packages that use some of these methods.
- Options:
  - Add deprecation warnings: Useful for parts of your package you want people to stop using, like some of the API, etc.
  - Delete it: Deleting a package or version is OK for quick oops mistakes, but allows someone else to grab the name, which is bad. Probably don't do this.
  - Redirect shim: Add a setup.py shim that just installs a different package. Cool idea, but a bit creepy.
  - Fail during install: Intentionally failing during install and redirecting people to use a different package, or just explaining why this one is dead. I think I like this the best.

Michael #5: Another progress bar library: Enlighten (https://pypi.org/project/enlighten/)
- by Avram Lubkin
- A few unique features:
- Multicolored progress bars - It's like many progress bars in one! You could use this in testing, where red is failure, green is success, and yellow is an error. Or maybe when loading something in stages such as loaded, started, connected, and the percentage of the bar for each color changes as the services start up. Has 24-bit color support.
- Writing to stdout and stderr just works! There are a lot of progress bars. Most of them just print garbage if you write to the terminal when they are running.
- Automatically handles resizing! (except on Windows)
- See the animation on the home page.

Brian #6: Code Ocean (https://codeocean.com/)
- Contributed by Daniel Mulkey
- From Daniel: "a peer-reviewed journal I read (SPIE's Optical Engineering) has a recommended platform for associating code with your article. It looks like it's focused on reproducibility in science."
- Code Ocean is a research collaboration platform that supports researchers from the beginning of a project through publication.
- This is a paid service, but has a free tier.
- Supports:
  - C/C++
  - Fortran
  - Java
  - Julia
  - Lua
  - MATLAB
  - Python (including Jupyter) (why is this listed so low? should be at the top!)
  - R
  - Stata
- From the "About Us" page:
  - "We built a platform that can help give researchers back 20% of the time they spend troubleshooting technology in order to run and reproduce past work before completing new experiments."
  - "Code Ocean is an open access platform for code and data where users can develop, share, publish, and download code through a web browser, eliminating the need to install software on personal computers. Our mission is to make computational research easier, more collaborative, and durable."

Extras:

Brian:
- Python 3.9.0b1 is available for testing: https://pythoninsider.blogspot.com/2020/05/python-390b1-is-now-available-for.html

Michael:
- SpaceX launch, lots of Python in action.

Joke:
- Sent over by Steven Howell: https://twitter.com/tecmint/status/1260251832905019392
- From Bert: https://twitter.com/schilduil/status/1264869362688765952
- But modified for my own experience:
- "What does pyjokes have in common with Java? It gets updated all the time, but never gets any better."
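A rough sketch of that job-queue idea from item #3 in plain asyncio; the worker count and do_work() are placeholders, not taken from the linked gists:

import asyncio

async def do_work(i):
    # stand-in for the real ~10ms of computational work
    await asyncio.sleep(0.01)

async def worker(queue):
    while True:
        coro = await queue.get()
        try:
            await coro
        finally:
            queue.task_done()

async def main():
    queue = asyncio.Queue()
    for i in range(200):
        queue.put_nowait(do_work(i))  # enqueue coroutines, not tasks
    # a small, fixed pool of tasks drains the queue
    workers = [asyncio.create_task(worker(queue)) for _ in range(4)]
    await queue.join()
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)

asyncio.run(main())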
June 04, 2020
PyCharm
PyCharm 2020.2 Early Access Program starts now!
The Early Access Program for our next major release, PyCharm 2020.2, is now open! If you are the kind of person who is always looking forward to the next ‘big thing’, we encourage you to join and share your thoughts on the latest PyCharm improvements! Our upcoming release is loaded with cool features!

If you are not familiar with our EAP programs, here are some ground rules:
- We roll out a new build every 2 weeks until the end of July
- EAP builds are free to use and expire 30 days after the build date
- You can install an EAP build side-by-side with your stable PyCharm version
- It’s important to note that these builds are not fully tested and might be unstable
- Your feedback is always welcome, use our issue tracker and make sure to mention your build version
Highlighted features
- Full support for GitHub Pull Requests from within PyCharm
- ‘Go to type declaration’ fully available for Python
- Automatically added welcome script to new Python projects
- Support for the Big Data Tools plugin
- Completion for CSS selectors in querySelector methods
These are just a few highlights from what’s coming! For more details check our release notes.
Interested?
Download this EAP from our website. Alternatively, you can use the JetBrains Toolbox App to stay up to date throughout the entire EAP.
If you’re on Ubuntu 16.04 or later, you can use snap to get PyCharm EAP and stay up to date. You can find the installation instructions on our website.
Reuven Lerner
Reminder: Python for non-programmers continues tomorrow!
We’re still going with my live, weekly (free) “Python for non-programmers” course. Our next meeting is tomorrow (June 5th) at 10 a.m. Eastern.
You can join at https://PythonForNonProgrammers.com/ .
If you haven’t yet signed up, it’s not too late! Anyone who signs up gets access to all previous videos, and to our private forum with discussion and homework assignments.
Many participants have previously tried to learn programming, and have come away frustrated… but they’re learning from this course, and enjoying it, too!
So join me (and 1,800 others) in this free, live class, at https://PythonForNonProgrammers.com/ !
The post Reminder: Python for non-programmers continues tomorrow! appeared first on Reuven Lerner.
Anarcat
Replacing Smokeping with Prometheus
I've been struggling with replacing parts of my old sysadmin monitoring toolkit (previously built with Nagios, Munin and Smokeping) with more modern tools (specifically Prometheus, its "exporters" and Grafana) for a while now.
Replacing Munin with Prometheus and Grafana is fairly straightforward: the network architecture ("server pulls metrics from all nodes") is similar and there are lots of exporters. They are a little harder to write than Munin modules, but that makes them more flexible and efficient, which was a huge problem in Munin. I wrote a Migrating from Munin guide that summarizes those differences. Replacing Nagios is much harder, and I still haven't quite figured out if it's worth it.
How does Smokeping work
Leaving those two aside for now, I'm left with Smokeping, which I used in my previous job to diagnose routing issues, using Smokeping as a decentralized looking glass, which was handy to debug long term issues. Smokeping is a strange animal: it's fundamentally similar to Munin, except it's harder to write plugins for it, so most people just use it for ping, something at which it excels.
Its trick is this: instead of doing a single ping and returning that one metric, it does multiple pings and returns multiple metrics. Specifically, smokeping will send multiple ICMP packets (20 by default), with a low interval (500ms by default) and a single retry. It also pings multiple hosts at once which means it can quickly scan multiple hosts simultaneously. You therefore see network conditions affecting one host reflected in further hosts down (or up) the chain. The multiple metrics also mean you can draw graphs with "error bars" which Smokeping shows as "smoke" (hence the name). You also get per-metric packet loss.
Basically, smokeping runs this command and collects the output in a RRD database:
fping -c $count -q -B $backoff -r $retry -4 -b $packetsize -t $timeout -i $mininterval -p $hostinterval $host [ $host ...]
... where those parameters are, by default:
- $count is 20 (packets)
- $backoff is 1 (avoid exponential backoff)
- $timeout is 1.5s
- $mininterval is 0.01s (minimum wait interval between any target)
- $hostinterval is 1.5s (minimum wait between probes on a single target)
It can also override stuff like the source address and TOS fields. This probe will complete between 30 and 60 seconds, if my math is right (0% and 100% packet loss).
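A back-of-the-envelope check of that "30 to 60 seconds" estimate, using the defaults listed above; this is just one plausible reading of the math, not something taken from the Smokeping docs:

count = 20          # packets per probe
hostinterval = 1.5  # seconds between packets to the same target
timeout = 1.5       # worst-case wait for an unanswered packet

best_case = count * hostinterval               # 0% packet loss: 30.0 seconds
worst_case = count * (hostinterval + timeout)  # 100% packet loss: 60.0 seconds
print(best_case, worst_case)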
How to draw Smokeping graphs in Grafana
A naive implementation of Smokeping in Prometheus/Grafana would be to use the blackbox exporter and create a dashboard displaying those metrics. I've done this at home, and then I realized that I was missing something. Here's what I did.
1. Install the blackbox exporter:

apt install prometheus-blackbox-exporter

2. Make sure to allow capabilities so it can ping:

dpkg-reconfigure prometheus-blackbox-exporter

3. Hook monitoring targets into prometheus.yml (the default blackbox exporter configuration is fine):

scrape_configs:
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [icmp]
    scrape_interval: 5s
    static_configs:
      - targets:
        - octavia.anarc.at # hardcoded in DNS
        - nexthop.anarc.at
        - koumbit.net
        - dns.google
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115 # The blackbox exporter's real hostname:port.

Notice how we lower the scrape_interval to 5 seconds to get more samples. nexthop.anarc.at was added into DNS to avoid hardcoding my upstream ISP's IP in my configuration.

4. Create a Grafana panel to graph the results. First, add this query:

sum(probe_icmp_duration_seconds{phase="rtt"}) by (instance)

- Set the Legend field to {{instance}} RTT
- Set Draw modes to lines and Mode options to staircase
- Set the Left Y axis Unit to duration (s)
- Show the Legend As table, with Min, Avg, Max and Current enabled

Then add this query, for packet loss:

1-avg_over_time(probe_success[$__interval])!=0 or null

- Set the Legend field to {{instance}} packet loss
- Set an Add series override to Lines: false, Null point mode: null, Points: true, Point Radius: 1, Color: deep red, and, most importantly, Y-axis: 2
- Set the Right Y axis Unit to percent (0.0-1.0) and set Y-max to 1

Then set the entire thing to Repeat, on target, vertically. And you need to add a target variable like label_values(probe_success, instance).
The result looks something like this:
Not bad, but not Smokeping
This actually looks pretty good!
I've uploaded the resulting dashboard in the Grafana dashboard repository.
What is missing?
Now, that doesn't exactly look like Smokeping, does it? It's pretty good, but it's not quite what we want. What is missing is variance, the "smoke" in Smokeping.
There's a good article about replacing Smokeping with Grafana. They wrote a custom script to write samples into InfluxDB so unfortunately we can't use it in this case, since we don't have InfluxDB's query language. I couldn't quite figure out how to do the same in PromQL. I tried:
stddev(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"})
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[$__interval])
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[1m])
The first two give zero for all samples. The latter works, but doesn't look as good as Smokeping. So there might be something I'm missing.
SuperQ wrote a special exporter for this called smokeping_prober that came out of this discussion in the blackbox exporter. Instead of delegating scheduling and target definition to Prometheus, the targets are set in the exporter.
They also take a different approach than Smokeping: instead of recording the individual variations, they delegate that to Prometheus, through the use of "buckets". Then they use a query like this:
histogram_quantile(0.9, rate(smokeping_response_duration_seconds_bucket[$__interval]))
This is the rationale for SuperQ's implementation:
Yes, I know about smokeping's bursts of pings. IMO, smokeping's data model is flawed that way. This is where I intentionally deviated from the smokeping exact way of doing things. This prober sends a smooth, regular series of packets in order to be measuring at regular controlled intervals.
Instead of 20 packets, over 10 seconds, every minute. You send one packet per second and scrape every 15. This has the same overall effect, but the measurement is, IMO, more accurate, as it's a continuous stream. There's no 50 second gap of no metrics about the ICMP stream.
Also, you don't get back one metric for those 20 packets, you get several. Min, Max, Avg, StdDev. With the histogram data, you can calculate much more than just that using the raw data.
For example, IMO, avg and max are not all that useful for continuous stream monitoring. What I really want to know is the 90th percentile or 99th percentile.
This smokeping prober is not intended to be a one-to-one replacement for exactly smokeping's real implementation. But simply provide similar functionality, using the power of Prometheus and PromQL to make it better.
[...]
one of the reason I prefer the histogram datatype, is you can use the heatmap panel type in Grafana, which is superior to the individual min/max/avg/stddev metrics that come from smokeping.
Say you had two routes, one slow and one fast. And some pings are sent over one and not the other. Rather than see a wide min/max equaling a wide stddev, the heatmap would show a "line" for both routes.
That's an interesting point. I have also ended up adding a heatmap graph to my dashboard, independently. And it is true it shows those "lines" much better... So maybe that, if we ignore legacy, we're actually happy with what we get, even with the plain blackbox exporter.
So yes, we're missing pretty "fuzz" lines around the main lines, but maybe that's alright. It would be possible to do the equivalent to the InfluxDB hack, with queries like:
min_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
avg_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[5m])
max_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
The output looks something like this:
Looks more like Smokeping!
But there's a problem there: see how the middle graph "dips" sometimes below 20ms? That's the min_over_time function (incorrectly, IMHO) returning zero. I haven't quite figured out how to fix that, and I'm not sure it is better. But it does look more like Smokeping than the previous graph.
Update: I forgot to mention one big thing that this setup is missing. Smokeping has this nice feature that you can order and group probe targets in a "folder"-like hierarchy. It is often used to group probes by location, which makes it easier to scan a lot of targets. This is harder to do in this setup. It might be possible to setup location-specific "jobs" and select based on that, but it's not exactly the same.
Credits
Credits to Chris Siebenmann for his article about Prometheus and pings, which gave me the avg_over_time query idea.
ListenData
How to drop one or multiple columns from Pandas Dataframe
- Drop or Keep rows and columns
- Aggregate data by one or more columns
- Sort or reorder data
- Merge or append multiple dataframes
- String Functions to handle text data
- DateTime Functions to handle date or time format columns
First, import the required packages:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
A B C D
0 -1.236438 -1.656038 1.655995 -1.413243
1 0.507747 0.710933 -1.335381 0.832619
2 0.280036 -0.411327 0.098119 0.768447
3 0.858730 -0.093217 1.077528 0.196891
4 -0.905991 0.302687 0.125881 -0.665159
5 -2.012745 -0.692847 -1.463154 -0.707779
Drop a column in python
In pandas, the drop() function is used to remove column(s). axis=1 tells Python that you want to apply the function to columns instead of rows.

df.drop(['A'], axis=1)

Column A has been removed. See the output shown below.
B C D
0 -1.656038 1.655995 -1.413243
1 0.710933 -1.335381 0.832619
2 -0.411327 0.098119 0.768447
3 -0.093217 1.077528 0.196891
4 0.302687 0.125881 -0.665159
5 -0.692847 -1.463154 -0.707779
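The article's title also mentions dropping multiple columns; the same drop() call handles that. A few hedged variations against the df created above:

# Drop multiple columns in one call
df.drop(['A', 'B'], axis=1)

# Equivalent, using the columns keyword instead of axis
df.drop(columns=['A', 'B'])

# Drop in place instead of returning a new DataFrame
df.drop(['A', 'B'], axis=1, inplace=True)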
Codementor
Python App Development: Perfect Web Framework choice for Startups
Python app development is perfect for startups building web applications. Python web programming saves cost and time, reducing the application's time to market.
Wing Tips
Configuring Wing Pro's Python Debugger for Your Code Base
This Wing Tip provides a roadmap to the configuration options available for Wing's debugger, to make it easier to understand the available possibilities and how these can be applied to your development projects.
Configuration Options
Broadly speaking there are five ways to configure Wing's debugger, depending on whether your code runs locally or on a remote system, and whether it is launched from the IDE or from the outside:
Local Stand-Alone Code -- Wing can debug stand-alone scripts and applications that run on your local machine and that are launched on demand from within Wing. This development approach can be used for anything that is convenient to launch from the IDE, including scripts, desktop apps, and most web frameworks. See the Debugger Quick-Start for a quick introduction to this simple case.
Remote Stand-Alone Code -- Wing Pro can also debug stand-alone code running on a remote host, virtual machine or device, in the same way as it debugs locally running code. Wing uses a remote agent launched by SSH in order to work directly with files stored on the remote host, as if Wing were itself running on that system. For details, see Remote Development with Wing Pro.
Local Externally Launched or Embedded Code -- Wing can debug locally running code that is launched by a web server or framework, embedded Python code that is used to script a larger application, and any other Python code that cannot be directly launched from the IDE. In this case, the code is started from outside of Wing and connects to the IDE by importing Wing's debugger. Debug can be controlled from the IDE and through an API accessible from the debug process. For details, see Debugging Externally Launched Code
Remote Externally Launched or Embedded Code -- Wing Pro can also debug externally launched or embedded code that is running on a remote system. In this case, Wing uses a remote agent to access the remote host via SSH and the debugged code imports Wing's debugger in order to connect back to the IDE through an automatically established reverse SSH tunnel. See Debugging Externally Launched Remote Code for brief instructions or Remote Web Development for a more detailed guide.
Manually Configured Remote Debugging -- For remote hosts and devices that are not accessible through SSH, or where Wing's remote agent cannot be run, Wing provides a manual configuration option to make debugging on these systems possible. In this case, the device must be able to connect to the host where Wing is running via TCP/IP, and there must be some file sharing configuration so files are available both locally and on the remote system. In this approach, connectivity, file sharing, and other configuration needed to make debugging possible is accomplished entirely manually, so it can be tailored to unusual custom environments. For details, see Manually Configured Remote Debugging.
Coming Soon in Wing 8
Although not yet available, it's worth mentioning another type of debug configuration that is coming soon:
Containerized Code -- Wing Pro 8 will be able to debug code running in containers like those provided by Docker, without requiring access to the container through SSH and without labor-intensive manual remote debug configuration. In this model, the IDE works with the local files that are used to build the container, and launches code for unit tests and debug in the container environment. This capability should be available fairly soon through our early access program.
VirtualEnv and Anaconda Environments
In the context of each of the above, Wing may be used with or without an environment created by virtualenv or Anaconda's conda create. For local debugging, this is selected by the Python Executable in Project Properties. For remote debugging, it is set in the remote host configuration.
For virtualenv, you can either set the Python Executable to Command Line and enter the full path to the virtualenv's Python, or you can select Activated Env and enter the command that activates the virtual environment.
For Anaconda environments, you must select Activated Env and then choose the environment from the drop down list to the right of this field.
For more information on this, please see Using Wing with virtualenv and Using Wing with Anaconda.
Specific Frameworks and Tools
Some frameworks and tools require some additional custom configuration to make them easy to work with in Wing. In addition to understanding the general options explained above, it is a good idea to seek out configuration details for the frameworks and tools that you use:
- Wing's documentation contains configuration instructions for specific frameworks and tools such as Flask, Django, Jupyter, wxPython, PyQt, Blender, Maya, and others.
- There is also additional information available for specific kinds of remote development for AWS, Docker, Vagrant, Windows Subsystem for Linux, Raspberry Pi, and others.
The New Project dialog accessed from the Project menu provides some assistance for setting up new Wing projects for most of these.
Multi-threaded and Multi-process Debugging
Wing automatically debugs any multi-threaded code without any additional configuration.
Multi-process code can also be debugged but requires turning on the Debug/Execute > Debug Child Processes option in Project Properties before child processes are automatically debugged. In this case, you may also want to configure specific options for how Wing handles and terminates child processes. See Multi-Process Debugging for details.
That's it for now! We'll be back soon with more Wing Tips for Wing Python IDE.
As always, please don't hesitate to email support@wingware.com if you run into problems or have any questions.
Matt Layman
Designing A View - Building SaaS #59
In this episode, I focused on a single view for adding a course to a school year. This view is reusing a form class from a different part of the app and sharing a template. We worked through the details of making a clear CreateView. The stream started with me answering a question about how I design a new feature. I outlined all the things that I think through for the different kinds of features that I need to build.
June 03, 2020
PyCharm
PyCharm 2020.1.2
PyCharm 2020.1.2 is out now with fixes that will improve your software development experience. Update from within PyCharm (Help | Check for Updates), using the JetBrains Toolbox, or by downloading the new version from our website.
In this version of PyCharm:
- The new action ‘Rescan Available Python Modules and Packages’ was added
- We added support for Coverage 5.0+
- We fixed a bug that made the cursor jump to the __call__ method of the metaclass instead of the class declaration when using ‘go to declaration’ on classes.
- We fixed a bug that triggered a false positive inspection “Unexpected argument” for Python 3 enum.Enum() functional constructor.
- We fixed a bug so that the CSS/SCSS formatter is now aware of CSS3 grid-layout properties.
- We fixed a bug that made the data area in the ‘Data View’ window too small after the window was shrunk and expanded.
- We fixed a bug that was freezing the editor when displaying documentation for some Matplotlib symbols.
And many more small fixes, see our release notes for details.
Getting the New Version
You can update PyCharm by choosing Help | Check for Updates (or PyCharm | Check for Updates on macOS) in the IDE. PyCharm will be able to patch itself to the new version, so there should no longer be a need to run the full installer.
If you’re on Ubuntu 16.04 or later, or any other Linux distribution that supports snap, you should not need to upgrade manually, you’ll automatically receive the new version.
Stack Abuse
Binary Search in Python
Introduction
In this article, we'll be diving into the idea behind and Python implementation of Binary Search.
Binary Search is an efficient search algorithm that works on sorted arrays. It's often used as one of the first examples of algorithms that run in logarithmic time (O(log n)) because of its intuitive behavior, and is a fundamental algorithm in Computer Science.
Binary Search - Example
Binary Search works on a divide-and-conquer approach and relies on the fact that the array is sorted to eliminate half of possible candidates in each iteration. More specifically, it compares the middle element of the sorted array to the element it's searching for in order to decide where to continue the search.
If the target element is larger than the middle element - it can't be located in the first half of the collection so it's discarded. The same goes the other way around.
Note: If the array has an even number of elements, it doesn't matter which of the two "middle" elements we use to begin with.
Let's look at an example quickly before we continue to explain how binary search works:

As we can see, we know for sure that, since the array is sorted, x is not in the first half of the original array.
When we know in which half of the original array x is, we can repeat this exact process with that half and split it into halves again, discarding the half that surely doesn't contain x:

We repeat this process until we end up with a subarray that contains only one element. We check whether that element is x. If it is - we found x, if it isn't - x doesn't exist in the array at all.
If you take a closer look at this, you can notice that in the worst-case scenario (x not existing in the array), we need to check a much smaller number of elements than we'd need to in an unsorted array - which would require something more along the lines of Linear Search, which is insanely inefficient.
To be more precise, the number of elements we need to check in the worst case is log2(N), where N is the number of elements in the array.
This makes a bigger impact the bigger the array is:
If our array had 10 elements, we would need to check only 3 elements to either find x or conclude it's not there. That's 33.3%.
However, if our array had 10,000,000 elements we would only need to check 24 elements. That's 0.0002%.
Binary Search Implementation
Binary Search is a naturally recursive algorithm, since the same process is repeated on smaller and smaller arrays until an array of size 1 has been found. However, there is of course an iterative implementation as well, and we will be showing both approaches.
Recursive
Let's start off with the recursive implementation as it's more natural:
def binary_search_recursive(array, element, start, end):
    if start > end:
        return -1

    mid = (start + end) // 2
    if element == array[mid]:
        return mid

    if element < array[mid]:
        return binary_search_recursive(array, element, start, mid-1)
    else:
        return binary_search_recursive(array, element, mid+1, end)
Let's take a closer look at this code. We exit the recursion if the start index is greater than the end index:
if start > end:
    return -1
This is because this situation occurs only when the element doesn't exist in the array. What happens is that we end up with only one element in the current sub-array, and that element doesn't match the one we're looking for.
At this point, start is equal to end. However, since element isn't equal to array[mid], we "split" the array again in such a way that we either decrease end by 1, or increase start by 1, and the recursion exits on that condition.
We could have done this using a different approach:
if len(array) == 1:
    if element == array[mid]:
        return mid
    else:
        return -1
The rest of the code does the "check middle element, continue search in the appropriate half of the array" logic. We find the index of the middle element and check whether the element we're searching for matches it:
mid = (start + end) // 2
if element == array[mid]:
    return mid
If it doesn't, we check whether the element is smaller or larger than the middle element:
if element < array[mid]:
    # Continue the search in the left half
    return binary_search_recursive(array, element, start, mid-1)
else:
    # Continue the search in the right half
    return binary_search_recursive(array, element, mid+1, end)
Let's go ahead and run this algorithm, with a slight modification so that it prints out which subarray it's working on currently:
element = 18
array = [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
print("Searching for {}".format(element))
print("Index of {}: {}".format(element, binary_search_recursive(array, element, 0, len(array))))
Running this code will result in:
Searching for 18
Subarray in step 0:[1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
Subarray in step 1:[16, 18, 24, 28, 29]
Subarray in step 2:[16, 18]
Subarray in step 3:[18]
Index of 18: 7
It's clear to see how it halves the search space in each iteration, getting closer and closer to the element we're looking for. If we tried searching for an element that doesn't exist in the array, the output would be:
Searching for 20
Subarray in step 0: [4, 14, 16, 17, 19, 21, 24, 28, 30, 35, 36, 38, 39, 40, 41, 43]
Subarray in step 1: [4, 14, 16, 17, 19, 21, 24, 28]
Subarray in step 2: [19, 21, 24, 28]
Subarray in step 3: [19]
Index of 20: -1
And just for the fun of it, we can try searching some large arrays and seeing how many steps it takes Binary Search to figure out whether a number exists:
Searching for 421, in an array with 200 elements
Search finished in 6 steps. Index of 421: 169
Searching for 1800, in an array with 1500 elements
Search finished in 11 steps. Index of 1800: -1
Searching for 3101, in an array with 3000 elements
Search finished in 8 steps. Index of 3101: 1551
Iterative
The iterative approach is very simple and similar to the recursive approach. Here, we just perform the checks in a while loop:
def binary_search_iterative(array, element):
    mid = 0
    start = 0
    end = len(array) - 1
    step = 0

    while (start <= end):
        print("Subarray in step {}: {}".format(step, str(array[start:end+1])))
        step = step + 1
        mid = (start + end) // 2

        if element == array[mid]:
            return mid

        if element < array[mid]:
            end = mid - 1
        else:
            start = mid + 1

    return -1
Let's populate an array and search for an element within it:
element = 18
array = [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]

print("Searching for {} in {}".format(element, array))
print("Index of {}: {}".format(element, binary_search_iterative(array, element)))
Running this code gives us the output of:
Searching for 18 in [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
Subarray in step 0: [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
Subarray in step 1: [16, 18, 24, 28, 29]
Subarray in step 2: [16, 18]
Subarray in step 3: [18]
Index of 18: 7
Conclusion
Binary Search is an incredible algorithm to use on large, sorted arrays, or whenever we plan to search for elements repeatedly in a single array.
The cost of sorting the array once and then using Binary Search to find elements in it multiple times is far better than using Linear Search on an unsorted array just so we could avoid the cost of sorting it.
If we're sorting the array and searching for an element just once, it's more efficient to just do a Linear Search on the unsorted array.
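As a side note, if you need binary search in practice rather than as an exercise, Python's standard library already ships one in the bisect module; a small sketch of using it to find an element's index:

from bisect import bisect_left

def bisect_index(array, element):
    # array must already be sorted
    i = bisect_left(array, element)
    if i < len(array) and array[i] == element:
        return i
    return -1

print(bisect_index([1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29], 18))  # 7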
If you'd like to read about Sorting Algorithms in Python, we've got you covered!
PyCharm
Introducing the PyCharm Guide
Want to be a badass at Python coding with PyCharm? Keep reading!
Over the last few years we have been collecting productivity tips, tutorials, and a lot more into a central, video-oriented resource, and now we are ready to introduce you to our brand new PyCharm Guide!
The PyCharm Guide currently houses tips, tutorials, and playlists organized by technology and topic. Many of the tips have an in-depth section with a narrated video. We currently have one tutorial and many more on the way with tutorial steps showing full videos and writeups with working code.
The Guide is made for the community, so we intend it to be an open-source project. It’s in GitHub, with one external contribution already, and a relatively-easy Markdown-based contribution format. We also have Guides from other products, both deployed and on the way.
The PyCharm Guide is a shift from publishing bunches of individual pieces to telling stories, over time, across platforms. It is closely linked to our new PyCharm YouTube channel and our Twitter account, as well as this blog, to weave together our stories and storytelling.
More to follow in the coming weeks, especially on the way to the Python Web Conference and our PyCharm tutorial there. If you have any suggestions on how we can rethink advocacy, education, and storytelling, drop a comment!
Real Python
Regular Expressions: Regexes in Python (Part 2)
In the previous tutorial in this series, you covered a lot of ground. You saw how to use re.search() to perform pattern matching with regexes in Python and learned about the many regex metacharacters and parsing flags that you can use to fine-tune your pattern-matching capabilities.
But as great as all that is, the re module has much more to offer.
In this tutorial, you’ll:
- Explore more functions, beyond re.search(), that the re module provides
- Learn when and how to precompile a regex in Python into a regular expression object
- Discover useful things that you can do with the match object returned by the functions in the re module
Ready? Let’s dig in!
Free Bonus: Get a sample chapter from Python Basics: A Practical Introduction to Python 3 to see how you can go from beginner to intermediate in Python with a complete curriculum, up-to-date for Python 3.8.
re Module Functions
In addition to re.search(), the re module contains several other functions to help you perform regex-related tasks.
Note: You saw in the previous tutorial that re.search() can take an optional <flags> argument, which specifies flags that modify parsing behavior. All the functions shown below, with the exception of re.escape(), support the <flags> argument in the same way.
You can specify <flags> as either a positional argument or a keyword argument:
re.search(<regex>, <string>, <flags>)
re.search(<regex>, <string>, flags=<flags>)
The default for <flags> is always 0, which indicates no special modification of matching behavior. Remember from the discussion of flags in the previous tutorial that the re.UNICODE flag is always set by default.
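Multiple flags can also be combined with the bitwise OR (|) operator; a quick illustrative example of my own, not taken from the tutorial:

>>> re.search(r'^bar', 'FOO\nBAR', re.IGNORECASE | re.MULTILINE)
<_sre.SRE_Match object; span=(4, 7), match='BAR'>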
The available regex functions in the Python re module fall into the following three categories:
- Searching functions
- Substitution functions
- Utility functions
The following sections explain these functions in more detail.
Searching Functions
Searching functions scan a search string for one or more matches of the specified regex:
| Function | Description |
|---|---|
| re.search() | Scans a string for a regex match |
| re.match() | Looks for a regex match at the beginning of a string |
| re.fullmatch() | Looks for a regex match on an entire string |
| re.findall() | Returns a list of all regex matches in a string |
| re.finditer() | Returns an iterator that yields regex matches from a string |
As you can see from the table, these functions are similar to one another. But each one tweaks the searching functionality in its own way.
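Here is a tiny REPL comparison of how these functions behave on the same input (my own example string, not from the tutorial):

>>> s = 'foo123bar456'
>>> print(re.match(r'\d+', s))       # the string doesn't start with digits
None
>>> re.search(r'\d+', s)             # first match anywhere in the string
<_sre.SRE_Match object; span=(3, 6), match='123'>
>>> re.findall(r'\d+', s)            # every match, as a list of strings
['123', '456']
>>> print(re.fullmatch(r'\d+', s))   # the entire string is not all digits
None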
re.search(<regex>, <string>, flags=0)
Scans a string for a regex match.
If you worked through the previous tutorial in this series, then you should be well familiar with this function by now. re.search(<regex>, <string>) looks for any location in <string> where <regex> matches:
>>> re.search(r'(\d+)', 'foo123bar')
<_sre.SRE_Match object; span=(3, 6), match='123'>
>>> re.search(r'[a-z]+', '123FOO456', flags=re.IGNORECASE)
<_sre.SRE_Match object; span=(3, 6), match='FOO'>
>>> print(re.search(r'\d+', 'foo.bar'))
None
The function returns a match object if it finds a match and None otherwise.
re.match(<regex>, <string>, flags=0)
Looks for a regex match at the beginning of a string.
Read the full article at https://realpython.com/regex-python-part-2/ »
CubicWeb
Report of June 3rd Cubicweb Meeting
Hi everyrone,
Version 3.28-rc1 is on its way! First, let's have a look to the issue board state.
Milestone update
- Introduced types #10
- logilab.common.deprecation has been typed (see hackathon report below): done
- Add tests for the content negotiation !20: MR about to be accepted
- Update logilab-common changelogs #43 : done
- Add automatic doc re-build to the CubicWeb CI #8 : done
Todo
- Review and accept MR !20
- Release logilab-common and cubicweb 3.28-rc1
Semver discussions
Right now, dependencies are only specifying a minimal version. So if we introduce a breaking change in a new version, apps might break too. We plan to follow semver convention to prevent this from happening.
We also discussed the idea of aligning version between compatible tools, so every major version would work with the same major version of other tools/dependencies.
This idea will be introduced in 3.29 documentation, but will probably start with the release of Cubicweb version 4.
Hackathon
Last Friday we did an internal hackathon at Logilab and Laurent, Noé and I spent time working on Cubicweb. We mainly:
- wrote changelogs for:
- logilab-common
- cubicweb
- tried to add a Merge Request template on Cubicweb
- doesn't work on Heptapod actually, we will ask Octobus to have a look (see #46)
- added annotation types on logilab.common.deprecated
- improved tox.ini and added a gitlab-ci.yaml file in the cube skeleton
That's all! You should receive an email soon about the rc1 release.
Thanks for reading,
Henri
Zato Blog
A dark theme for auto-generated API documentation
Starting with version 3.2, Zato will use a new, dark theme for its auto-generated API documentation and specifications. Here is its preview.
An API service and its documentation
Suppose we have a Zato service with I/O as defined below ..
# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class Login(Service):
    """ Logs a user in.

    - Sets session to expire in 24 hours if unused
    - Metadata stored for each session:
      - Creation time in UTC
      - Remote IP and fully-qualified name
    """
    name = 'my.api.login'

    class SimpleIO:
        """
        * user_name - User name to log in as
        * password - User password

        * token - Session token
        * user_id - User's ID
        * user_display_name - User's display name, e.g. first and last name
        * user_email - User's email
        * preferred_language - User's preferred language in the system
        """
        input_required = 'user_name', 'password'
        output_required = 'token', 'user_id', 'user_display_name', 'user_email'
        output_optional = 'preferred_language'

    def handle(self):
        # Skip actual implementation
        pass
.. here is how its documentation looks with the dark theme, as generated by zato apispec.
Note that, as previously, the quick access WSDL and OpenAPI links are in the downloads section, on the left-hand side.
More links
To check it out live, here is a link to the actual documentation from the screenshot above. Also, Zato has its own API documentation whose specification you can find here.
Coming up next
Yet, there is more. The new theme is but a part of a series of works focused on API documentation and specifications. Coming up next are:
- A service invoker to execute services directly from their documentation
- A mobile version
- Searchable API specifications
Stay tuned for more news.
Django Weblog
Django security releases issued: 3.0.7 and 2.2.13
In accordance with our security release policy, the Django team is issuing Django 3.0.7 and Django 2.2.13. These releases address the security issue detailed below. We encourage all users of Django to upgrade as soon as possible.
CVE-2020-13254: Potential data leakage via malformed memcached keys
In cases where a memcached backend does not perform key validation, passing malformed cache keys could result in a key collision, and potential data leakage. In order to avoid this vulnerability, key validation is added to the memcached cache backends.
Thank you to Dan Palmer for the report and patch.
CVE-2020-13596: Possible XSS via admin ForeignKeyRawIdWidget
Query parameters for the admin ForeignKeyRawIdWidget were not properly URL encoded, posing an XSS attack vector. ForeignKeyRawIdWidget now ensures query parameters are correctly URL encoded.
Thank you to Jon Dufresne for the report and patch.
Affected supported versions
- Django master branch
- Django 3.1 (currently at alpha status)
- Django 3.0
- Django 2.2
Resolution
Patches to resolve the issue have been applied to Django's master branch and the 3.1, 3.0, and 2.2 release branches. The patches may be obtained from the following changesets:
CVE-2020-13254:
- On the master branch
- On the 3.1 release branch
- On the 3.0 release branch
- On the 2.2 release branch
CVE-2020-13596:
- On the master branch
- On the 3.1 release branch
- On the 3.0 release branch
- On the 2.2 release branch
The following releases have been issued:
- Django 3.0.7 (download Django 3.0.7 | 3.0.7 checksums)
- Django 2.2.13 (download Django 2.2.13 | 2.2.13 checksums)
The PGP key ID used for these releases is Carlton Gibson: E17DF5C82B4F9D00.
General notes regarding security reporting
As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.
Codementor
My Top 7 Picks on PyCon 2020 Online
My top 7 picks for PyCon 2020 videos that are useful for Python developers, data scientists & educators.
Karim Elghamrawy
How to Setup Python 3 on Windows? (Step-by-Step)
The post How to Setup Python 3 on Windows? (Step-by-Step) appeared first on Afternerd.
Kushal Das
Onion location and Onion names in Tor Browser 9.5
Yesterday the Tor Browser 9.5 was released. I am excited about this release for some user-focused updates.
Onion-Location header
If your webserver provides this one extra header, Onion-Location, the Tor Browser will ask the user if they want to visit the onion site itself. The user can even choose to visit every such onion site by default. See it in action here.
To enable this, in Apache, you need a configuration line like below for your website’s configuration.

Header set Onion-Location "http://your-onion-address.onion%{REQUEST_URI}s"
Remember to enable rewrite module.
For nginx, add the following in your server configuration.
add_header Onion-Location http://<your-onion-address>.onion$request_uri;
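The two snippets above set the header at the web-server level; if a site is instead served directly by a Python web application, the same header can be added from application code. A minimal, hypothetical Flask sketch (the onion address is a placeholder):

from flask import Flask, request

app = Flask(__name__)

@app.after_request
def add_onion_location(response):
    # Advertise a hypothetical onion mirror of this site, preserving the request path
    onion = 'http://your-onion-address.onion'
    response.headers['Onion-Location'] = onion + request.full_path.rstrip('?')
    return response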
URLs we can remember, aka onion names
This is the first proof of concept built along with Freedom of the Press Foundation (yes, my team) and HTTPS Everywhere to help people to use simple names for onion addresses. For example, below, you can see that I typed theintercept.securedrop.tor.onion on the browser, and that took us to The Intercept's SecureDrop address.

June 02, 2020
Obey the Testing Goat
Cosmic Python
Folks I've written a new book!
Along with my coauthor Bob, we are proud to release "Architecture Patterns with Python", which you can find out more about at cosmicpython.com.
The cosmic soubriquet is a little joke, Cosmos being the opposite of Chaos in ancient Greek, so we want to propose patterns to minimise chaos in your applications.
But the subtitle of the book is Enabling TDD, DDD, and Event-Driven Microservices, and the TDD part is relevant to this blog, and to fans of the Testing Goat. In my two years at MADE and working with Bob, I've refined some of my thinking and some of the ways I approach testing, and I think if I were writing TDDwP again today, I might change the way I present some things.
In brief:
- Mocking is not the only way to handle external (I/O et al) dependencies for your unit tests. Other techniques are possible, and often offer major benefits.
- If you really want to get to a test pyramid (where unit tests outnumber slow/e2e/integration tests by an order of magnitude), then you'll probably need to make some specific design choices around identifying business logic and decoupling it from infrastructure code.
- When deciding what kind of unit tests to write, there's a lot to be said for writing them at the highest level of abstraction possible. It gives you more room to refactor later.
If you're curious about those questions, head on over to cosmicpython.com, and let me know what you think!
Jaime Buelta
2nd Edition for Python Automation Cookbook now available!
Good news everyone! There’s a new edition of the Python Automation Cookbook! A great way of improving your Python skills for practical tasks! Like the first edition, it’s aimed at people who already know a bit of Python (not necessarily developers). It describes how to automate common tasks: things like working with different kinds of documents, generating graphs, sending emails, text messages… You can check the whole table of contents for more details. It’s written in the cookbook format, so it’s a collection of recipes to read and reference independently. There are... Read More