Planet Python
Last update: June 06, 2020 04:47 AM UTC
June 05, 2020
Stack Abuse
Reading and Writing Excel (XLSX) Files in Python with the Pandas Library
Introduction
Just like with all other types of files, you can use the Pandas library to read and write Excel files using Python as well. In this short tutorial, we are going to discuss how to read and write Excel files via DataFrames.
In addition to simple reading and writing, we will also learn how to write multiple DataFrames into an Excel file, how to read specific rows and columns from a spreadsheet, and how to name single and multiple sheets within a file before doing anything.
If you'd like to learn more about other file types, we've got you covered:
- Reading and Writing JSON Files in Python with Pandas
- Reading and Writing CSV Files in Python with Pandas
Reading and Writing Excel Files in Python with Pandas
Naturally, to use Pandas, we first have to install it. The easiest method to install it is via pip.
If you're running Windows:
$ python -m pip install pandas
If you're using Linux or MacOS:
$ pip install pandas
Note that you may get a ModuleNotFoundError or ImportError when running the code in this article. For example:
ModuleNotFoundError: No module named 'openpyxl'
If this is the case, then you'll need to install the missing module(s):
$ pip install openpyxl xlsxwriter xlrd
Writing Excel Files Using Pandas
We'll be storing the information we'd like to write to an Excel file in a DataFrame. Using the built-in to_excel() function, we can export this information to an Excel file.
First, let's import the Pandas module:
import pandas as pd
Now, let's use a dictionary to populate a DataFrame:
df = pd.DataFrame({'States': ['California', 'Florida', 'Montana', 'Colorado', 'Washington', 'Virginia'],
                   'Capitals': ['Sacramento', 'Tallahassee', 'Helena', 'Denver', 'Olympia', 'Richmond'],
                   'Population': ['508529', '193551', '32315', '619968', '52555', '227032']})
The keys in our dictionary will serve as column names. Similarly, the values become the rows containing the information.
Now, we can use the to_excel() function to write the contents to a file. The only argument is the file path:
df.to_excel('./states.xlsx')
Here's the Excel file that was created:

Please note that we are not using any parameters in our example. Therefore, the sheet within the file retains its default name - "Sheet1". As you can see, our Excel file has an additional column containing numbers. These numbers are the indices for each row, coming straight from the Pandas DataFrame.
We can change the name of our sheet by adding the sheet_name parameter to our to_excel() call:
df.to_excel('./states.xlsx', sheet_name='States')
Similarly, adding the index parameter and setting it to False will remove the index column from the output:
df.to_excel('./states.xlsx', sheet_name='States', index=False)
Now, the Excel file looks like this:

Writing Multiple DataFrames to an Excel File
It is also possible to write multiple dataframes to an Excel file. If you'd like to, you can set a different sheet for each dataframe as well:
income1 = pd.DataFrame({'Names': ['Stephen', 'Camilla', 'Tom'],
                        'Salary': [100000, 70000, 60000]})

income2 = pd.DataFrame({'Names': ['Pete', 'April', 'Marty'],
                        'Salary': [120000, 110000, 50000]})

income3 = pd.DataFrame({'Names': ['Victor', 'Victoria', 'Jennifer'],
                        'Salary': [75000, 90000, 40000]})

income_sheets = {'Group1': income1, 'Group2': income2, 'Group3': income3}
writer = pd.ExcelWriter('./income.xlsx', engine='xlsxwriter')

for sheet_name in income_sheets.keys():
    income_sheets[sheet_name].to_excel(writer, sheet_name=sheet_name, index=False)

writer.save()
Here, we've created 3 different dataframes containing various names of employees and their salaries as data. Each of these dataframes is populated by its respective dictionary.
We've combined these three within the income_sheets variable, where each key is the sheet name, and each value is the DataFrame object.
Finally, we've used the xlsxwriter engine to create a writer object. This object is passed to the to_excel() function call.
Then we loop through the keys of income_sheets and, for each key, write the corresponding DataFrame to the sheet with that name.
Here is the generated file:

You can see that the Excel file has three different sheets named Group1, Group2, and Group3. Each of these sheets contains the names of employees and their salaries, matching the data in the three different DataFrames in our code.
The engine parameter in the to_excel() function is used to specify which underlying module is used by the Pandas library to create the Excel file. In our case, the xlsxwriter module is used as the engine for the ExcelWriter class. Different engines can be specified depending on their respective features.
Depending upon the Python modules installed on your system, the other options for the engine attribute are: openpyxl (for xlsx and xlsm), and xlwt (for xls).
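If you want to be explicit about which engine is used, to_excel() accepts it directly as well; a minimal sketch, assuming openpyxl is installed:
df.to_excel('./states.xlsx', sheet_name='States', index=False, engine='openpyxl')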
Further details of using the xlsxwriter module with Pandas library are available at the official documentation.
Last but not least, in the code above we have to explicitly save the file using writer.save(), otherwise it won't be persisted on the disk.
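Alternatively, ExcelWriter can be used as a context manager, which takes care of saving the file when the block exits; a small sketch of the same loop written that way:

with pd.ExcelWriter('./income.xlsx', engine='xlsxwriter') as writer:
    for sheet_name, frame in income_sheets.items():
        frame.to_excel(writer, sheet_name=sheet_name, index=False)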
Reading Excel Files with Pandas
In contrast to writing DataFrame objects to an Excel file, we can do the opposite by reading Excel files into DataFrames. Packing the contents of an Excel file into a DataFrame is as easy as calling the read_excel() function:
students_grades = pd.read_excel('./grades.xlsx')
students_grades.head()
For this example, we're reading this Excel file.
Here, the only required argument is the path to the Excel file. The contents are read and packed into a DataFrame, which we can then preview via the head() function.
Note: Using this method, although the simplest one, will only read the first sheet.
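If you need a sheet other than the first one, read_excel() also accepts a sheet_name argument; passing None loads every sheet into a dictionary of DataFrames. A short sketch (the sheet name 'Grades' is an assumption about the example file):

# Read a single, named sheet
grades = pd.read_excel('./grades.xlsx', sheet_name='Grades')

# Read all sheets at once, as a dict mapping sheet names to DataFrames
all_sheets = pd.read_excel('./grades.xlsx', sheet_name=None)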
Let's take a look at the output of the head() function:

Pandas assigns a row label or numeric index to the DataFrame by default when we use the read_excel() function.
We can override the default index by passing one of the columns in the Excel file as the index_col parameter:
students_grades = pd.read_excel('./grades.xlsx', sheet_name='Grades', index_col='Grade')
students_grades.head()
Running this code will result in:

In the example above, we have replaced the default index with the "Grade" column from the Excel file. However, you should only override the default index if you have a column with values that could serve as a better index.
Reading Specific Columns from an Excel File
Reading a file in its entirety is useful, though in many cases, you'd really want to access a certain element. For example, you might want to read the element's value and assign it to a field of an object.
Again, this is done using the read_excel() function, though this time we'll also pass the usecols parameter, which limits the function to reading only certain columns. Let's add the parameter so that we read the columns that correspond to the "Student Name", "Grade" and "Marks Obtained" values.
We do this by specifying the numeric index of each column:
cols = [0, 1, 3]
students_grades = pd.read_excel('./grades.xlsx', usecols=cols)
students_grades.head()
Running this code will yield:

As you can see, we are only retrieving the columns specified in the cols list.
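usecols also accepts column labels instead of positions, which can be more readable; a quick sketch, assuming the headers in the file match the names above:
students_grades = pd.read_excel('./grades.xlsx', usecols=['Student Name', 'Grade', 'Marks Obtained'])
students_grades.head()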
Conclusion
We've covered some general usage of the read_excel() and to_excel() functions of the Pandas library. With them, we've read existing Excel files and written our own data to them.
Using various parameters, we can alter the behavior of these functions, allowing us to build customized files, rather than just dumping everything from a DataFrame.
PyCharm
Smart execution of R code
The R plugin is announcing some helpful features to track the execution of your R code:
1. Execute your R file as a runnable process, or job. Jobs are shown in a separate tab in the R console. You can preview the job status (succeeded or failed), the duration of the execution, and the time you launched the job.
When starting a new job, you can specify the way you want to process the results of the job execution. You can restrict copying it, copy to the global environment, or copy it into a separate variable. To preview the results, switch to the Variables pane:
2. Try new ways to quickly import data files. You can now download data from CSV, TSV, or XLS files into your global environment:
Once added, the data can be accessed from your R code.
In this release, we also introduced some stability improvements and enhancements for resolving and autocompleting named arguments.
Interested?
Download PyCharm from our website and install the R plugin. See more details and installation instructions in PyCharm documentation.
Catalin George Festila
Python 3.8.3 : Using the fabric python module - part 001.
The tutorial for today is about the fabric Python module. You can read about this Python module on the official webpage. The team comes with this intro: Fabric is a high level Python (2.7, 3.4+) library designed to execute shell commands remotely over SSH, yielding useful Python objects in return... It builds on top of Invoke (subprocess command execution and command-line features) and Paramiko (SSH
Real Python
The Real Python Podcast – Episode #12: Web Scraping in Python: Tools, Techniques, and Legality
Do you want to get started with web scraping using Python? Are you concerned about the potential legal implications? What are the tools required and what are some of the best practices? This week on the show we have Kimberly Fessel to discuss her excellent tutorial created for PyCon 2020 online titled "It's Officially Legal so Let's Scrape the Web."
Doug Hellmann
sphinxcontrib-spelling 5.1.0
sphinxcontrib-spelling is a spelling checker for Sphinx-based documentation. It uses PyEnchant to produce a report showing misspelled words. What’s new in 5.1.0? Add an option to show the line containing a misspelling for context (contributed by Huon Wilson)
Python Bytes
#184 Too many ways to wait with await?
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean - $100 credit for new users to build something awesome.

Michael #1: Waiting in asyncio (https://hynek.me/articles/waiting-in-asyncio/)
- by Hynek Schlawack
- One of the main appeals of using Python's asyncio is being able to fire off many coroutines and run them concurrently. How many ways do you know for waiting for their results?
- The simplest case is to await your coroutines:

result_f = await f()
result_g = await g()

- Drawbacks:
  1. The coroutines do not run concurrently. g only starts executing after f has finished.
  2. You can't cancel them once you started awaiting.
- asyncio.Task wraps your coroutines and gets independently scheduled for execution by the event loop whenever you yield control to it:

task_f = asyncio.create_task(f())
task_g = asyncio.create_task(g())
await asyncio.sleep(0.1)  # <- f() and g() are already running!
result_f = await task_f
result_g = await task_g

- Your tasks now run concurrently, and if you decide that you don't want to wait for task_f or task_g to finish, you can cancel them using task_f.cancel()
- asyncio.gather() takes 1 or more awaitables as *args, wraps them in tasks if necessary, and waits for all of them to finish. Then it returns the results of all awaitables in the same order:

result_f, result_g = await asyncio.gather(f(), g())

- asyncio.wait_for() allows for passing a timeout
- A more elegant approach to timeouts is the async-timeout package on PyPI (https://pypi.org/project/async-timeout/). It gives you an asynchronous context manager that allows you to apply a total timeout even if you need to execute the coroutines sequentially:

async with async_timeout.timeout(5.0):
    await f()
    await g()

- asyncio.as_completed() takes an iterable of awaitables and returns an iterator that yields asyncio.Futures in the order the awaitables are done:

for fut in asyncio.as_completed([task_f, task_g], timeout=5.0):
    try:
        await fut
        print("one task down!")
    except Exception:
        print("ouch")

- Michael's Async Python course: http://talkpython.fm/async

Brian #2: virtualenv is faster than venv
- virtualenv docs (https://virtualenv.pypa.io/en/latest/): "virtualenv is a tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard library under the venv module. The venv module does not offer all features of this library, to name just a few more prominent:
  - is slower (by not having the app-data seed method),
  - is not as extendable,
  - cannot create virtual environments for arbitrarily installed python versions (and automatically discover these),
  - is not upgrade-able via pip,
  - does not have as rich programmatic API (describe virtual environments without creating them)."
- pro: faster: under 0.5 seconds vs about 2.5 seconds
- con: the --prompt is weird. I like the parens and the space, and 3.9's magic "." option for prompt to name it after the current directory.
- pro: the pip you get in your env is already updated
- conclusion:
  - I'm on the fence for my own use. Probably leaning more toward keeping built in. But not having to update pip is nice.
  - For teaching, I'll stick with the built in venv.
  - The "extendable" and "has an API" parts really don't matter much to me.

$ time python3.9 -m venv venv --prompt .
real    0m2.698s
user    0m2.055s
sys     0m0.606s
$ source venv/bin/activate
(try) $ deactivate
$ rm -fr venv
$ time python3.9 -m virtualenv venv --prompt "(try) "
...
real    0m0.384s
user    0m0.202s
sys     0m0.255s
$ source venv/bin/activate
(try) $

Michael #3: Latency in Asynchronous Python (https://nullprogram.com/blog/2020/05/24/)
- Article by Chris Wellons
- Was debugging a misbehaving Python program that makes significant use of Python's asyncio.
- The program would eventually take very long periods of time to respond to network requests.
- The program's author had made a couple of fundamental mistakes using asyncio.
- Scenario:
  - Have a "heartbeat" async method that beats once every ms (heartbeat delay = 0.001s)
  - Have a computational amount of work that takes 10ms
  - Need to run a bunch of these computational things (say 200).
  - But starting the heartbeat blocks the asyncio event loop
  - See my example at https://gist.github.com/mikeckennedy/d9ac5a600f91971c6933b4f41a8df480
- Unsync (https://github.com/alex-sherman/unsync) fixes this and improves the code! Here's my example: https://gist.github.com/mikeckennedy/f23b9b5abd9452cdc8b3bacaf1c3da20
- Need to limit the number of "active" tasks at a time.
- Solving it with a job queue: Here's what does work: a job queue. Create a queue to be populated with coroutines (not tasks), and have a small number of tasks run jobs from the queue (see the sketch after these show notes).

Brian #4: How to Deprecate a PyPI Package (https://www.dampfkraft.com/code/how-to-deprecate-a-pypi-package.html)
- Paul McCann, @polm23
- A collection of options for how to get people to stop using your package on PyPI. Also includes code samples or example packages that use some of these methods.
- Options:
  - Add deprecation warnings: Useful for parts of your package you want people to stop using, like some of the API, etc.
  - Delete it: Deleting a package or version is OK for quick oops mistakes, but allows someone else to grab the name, which is bad. Probably don't do this.
  - Redirect shim: Add a setup.py shim that just installs a different package. Cool idea, but a bit creepy.
  - Fail during install: Intentionally failing during install and redirecting people to use a different package, or just explaining why this one is dead. I think I like this the best.

Michael #5: Another progress bar library: Enlighten (https://pypi.org/project/enlighten/)
- by Avram Lubkin
- A few unique features:
- Multicolored progress bars - It's like many progress bars in one! You could use this in testing, where red is failure, green is success, and yellow is an error. Or maybe when loading something in stages such as loaded, started, connected, and the percentage of the bar for each color changes as the services start up. Has 24-bit color support.
- Writing to stdout and stderr just works! There are a lot of progress bars. Most of them just print garbage if you write to the terminal when they are running.
- Automatically handles resizing! (except on Windows)
- See the animation on the home page.

Brian #6: Code Ocean (https://codeocean.com/)
- Contributed by Daniel Mulkey
- From Daniel: "a peer-reviewed journal I read (SPIE's Optical Engineering) has a recommended platform for associating code with your article. It looks like it's focused on reproducibility in science."
- Code Ocean is a research collaboration platform that supports researchers from the beginning of a project through publication.
- This is a paid service, but has a free tier.
- Supports:
  - C/C++
  - Fortran
  - Java
  - Julia
  - Lua
  - MATLAB
  - Python (including Jupyter) (why is this listed so low? should be at the top!)
  - R
  - Stata
- From the "About Us" page:
  - "We built a platform that can help give researchers back 20% of the time they spend troubleshooting technology in order to run and reproduce past work before completing new experiments."
  - "Code Ocean is an open access platform for code and data where users can develop, share, publish, and download code through a web browser, eliminating the need to install software on personal computers. Our mission is to make computational research easier, more collaborative, and durable."

Extras:

Brian:
- Python 3.9.0b1 is available for testing: https://pythoninsider.blogspot.com/2020/05/python-390b1-is-now-available-for.html

Michael:
- SpaceX launch, lots of Python in action.

Joke:
- Sent over by Steven Howell: https://twitter.com/tecmint/status/1260251832905019392
- From Bert: https://twitter.com/schilduil/status/1264869362688765952
- But modified for my own experience:
- "What does pyjokes have in common with Java? It gets updated all the time, but never gets any better."
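A rough sketch of that job-queue idea from item #3 in plain asyncio; the worker count and do_work() are placeholders, not taken from the linked gists:

import asyncio

async def do_work(i):
    # stand-in for the real ~10ms of computational work
    await asyncio.sleep(0.01)

async def worker(queue):
    while True:
        coro = await queue.get()
        try:
            await coro
        finally:
            queue.task_done()

async def main():
    queue = asyncio.Queue()
    for i in range(200):
        queue.put_nowait(do_work(i))  # enqueue coroutines, not tasks
    # a small, fixed pool of tasks drains the queue
    workers = [asyncio.create_task(worker(queue)) for _ in range(4)]
    await queue.join()
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)

asyncio.run(main())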
June 04, 2020
PyCharm
PyCharm 2020.2 Early Access Program starts now!
The Early Access Program for our next major release, PyCharm 2020.2, is now open! If you are the kind of person who is always looking forward to the next ‘big thing’, we encourage you to join and share your thoughts on the latest PyCharm improvements! Our upcoming release is loaded with cool features!

If you are not familiar with our EAP programs, here are some ground rules:
- We roll out a new build every 2 weeks until the end of July
- EAP builds are free to use and expire 30 days after the build date
- You can install an EAP build side-by-side with your stable PyCharm version
- It’s important to note that these builds are not fully tested and might be unstable
- Your feedback is always welcome, use our issue tracker and make sure to mention your build version
Highlighted features
- Full support for GitHub Pull Requests from within PyCharm
- ‘Go to type declaration’ fully available for Python
- Automatically added welcome script to new Python projects
- Support for the Big Data Tools plugin
- Completion for CSS selectors in querySelector methods
These are just a few highlights from what’s coming! For more details check our release notes.
Interested?
Download this EAP from our website. Alternatively, you can use the JetBrains Toolbox App to stay up to date throughout the entire EAP.
If you’re on Ubuntu 16.04 or later, you can use snap to get PyCharm EAP and stay up to date. You can find the installation instructions on our website.
Reuven Lerner
Reminder: Python for non-programmers continues tomorrow!
We’re still going with my live, weekly (free) “Python for non-programmers” course. Our next meeting is tomorrow (June 5th) at 10 a.m. Eastern.
You can join at https://PythonForNonProgrammers.com/ .
If you haven’t yet signed up, it’s not too late! Anyone who signs up gets access to all previous videos, and to our private forum with discussion and homework assignments.
Many participants have previously tried to learn programming, and have come away frustrated… but they’re learning from this course, and enjoying it, too!
So join me (and 1,800 others) in this free, live class, at https://PythonForNonProgrammers.com/ !
The post Reminder: Python for non-programmers continues tomorrow! appeared first on Reuven Lerner.
Anarcat
Replacing Smokeping with Prometheus
I've been struggling with replacing parts of my old sysadmin monitoring toolkit (previously built with Nagios, Munin and Smokeping) with more modern tools (specifically Prometheus, its "exporters" and Grafana) for a while now.
Replacing Munin with Prometheus and Grafana is fairly straightforward: the network architecture ("server pulls metrics from all nodes") is similar and there are lots of exporters. They are a little harder to write than Munin modules, but that makes them more flexible and efficient, which was a huge problem in Munin. I wrote a Migrating from Munin guide that summarizes those differences. Replacing Nagios is much harder, and I still haven't quite figured out if it's worth it.
How does Smokeping work
Leaving those two aside for now, I'm left with Smokeping, which I used in my previous job to diagnose routing issues, using Smokeping as a decentralized looking glass, which was handy to debug long term issues. Smokeping is a strange animal: it's fundamentally similar to Munin, except it's harder to write plugins for it, so most people just use it for ping, something at which it excels.
Its trick is this: instead of doing a single ping and returning that one metric, it does multiple pings and returns multiple metrics. Specifically, smokeping will send multiple ICMP packets (20 by default), with a low interval (500ms by default) and a single retry. It also pings multiple hosts at once which means it can quickly scan multiple hosts simultaneously. You therefore see network conditions affecting one host reflected in further hosts down (or up) the chain. The multiple metrics also mean you can draw graphs with "error bars" which Smokeping shows as "smoke" (hence the name). You also get per-metric packet loss.
Basically, smokeping runs this command and collects the output in a RRD database:
fping -c $count -q -B $backoff -r $retry -4 -b $packetsize -t $timeout -i $mininterval -p $hostinterval $host [ $host ...]
... where those parameters are, by default:
- $count is 20 (packets)
- $backoff is 1 (avoid exponential backoff)
- $timeout is 1.5s
- $mininterval is 0.01s (minimum wait interval between any target)
- $hostinterval is 1.5s (minimum wait between probes on a single target)
It can also override stuff like the source address and TOS fields. This probe will complete between 30 and 60 seconds, if my math is right (0% and 100% packet loss).
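A back-of-the-envelope check of that "30 to 60 seconds" estimate, using the defaults listed above; this is just one plausible reading of the math, not something taken from the Smokeping docs:

count = 20          # packets per probe
hostinterval = 1.5  # seconds between packets to the same target
timeout = 1.5       # worst-case wait for an unanswered packet

best_case = count * hostinterval               # 0% packet loss: 30.0 seconds
worst_case = count * (hostinterval + timeout)  # 100% packet loss: 60.0 seconds
print(best_case, worst_case)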
How to draw Smokeping graphs in Grafana
A naive implementation of Smokeping in Prometheus/Grafana would be to use the blackbox exporter and create a dashboard displaying those metrics. I've done this at home, and then I realized that I was missing something. Here's what I did.
1. Install the blackbox exporter:

apt install prometheus-blackbox-exporter

2. Make sure to allow capabilities so it can ping:

dpkg-reconfigure prometheus-blackbox-exporter

3. Hook monitoring targets into prometheus.yml (the default blackbox exporter configuration is fine):

scrape_configs:
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [icmp]
    scrape_interval: 5s
    static_configs:
      - targets:
        - octavia.anarc.at # hardcoded in DNS
        - nexthop.anarc.at
        - koumbit.net
        - dns.google
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115 # The blackbox exporter's real hostname:port.

Notice how we lower the scrape_interval to 5 seconds to get more samples. nexthop.anarc.at was added into DNS to avoid hardcoding my upstream ISP's IP in my configuration.

4. Create a Grafana panel to graph the results. First, add this query:

sum(probe_icmp_duration_seconds{phase="rtt"}) by (instance)

- Set the Legend field to {{instance}} RTT
- Set Draw modes to lines and Mode options to staircase
- Set the Left Y axis Unit to duration (s)
- Show the Legend As table, with Min, Avg, Max and Current enabled

Then add this query, for packet loss:

1-avg_over_time(probe_success[$__interval])!=0 or null

- Set the Legend field to {{instance}} packet loss
- Set an Add series override to Lines: false, Null point mode: null, Points: true, Point Radius: 1, Color: deep red, and, most importantly, Y-axis: 2
- Set the Right Y axis Unit to percent (0.0-1.0) and set Y-max to 1

Then set the entire thing to Repeat, on target, vertically. And you need to add a target variable like label_values(probe_success, instance).
The result looks something like this:
Not bad, but not Smokeping
This actually looks pretty good!
I've uploaded the resulting dashboard in the Grafana dashboard repository.
What is missing?
Now, that doesn't exactly look like Smokeping, does it? It's pretty good, but it's not quite what we want. What is missing is variance, the "smoke" in Smokeping.
There's a good article about replacing Smokeping with Grafana. They wrote a custom script to write samples into InfluxDB so unfortunately we can't use it in this case, since we don't have InfluxDB's query language. I couldn't quite figure out how to do the same in PromQL. I tried:
stddev(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"})
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[$__interval])
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[1m])
The first two give zero for all samples. The latter works, but doesn't look as good as Smokeping. So there might be something I'm missing.
SuperQ wrote a special exporter for this called smokeping_prober that came out of this discussion in the blackbox exporter. Instead of delegating scheduling and target definition to Prometheus, the targets are set in the exporter.
They also take a different approach than Smokeping: instead of recording the individual variations, they delegate that to Prometheus, through the use of "buckets". Then they use a query like this:
histogram_quantile(0.9, rate(smokeping_response_duration_seconds_bucket[$__interval]))
This is the rationale for SuperQ's implementation:
Yes, I know about smokeping's bursts of pings. IMO, smokeping's data model is flawed that way. This is where I intentionally deviated from the smokeping exact way of doing things. This prober sends a smooth, regular series of packets in order to be measuring at regular controlled intervals.
Instead of 20 packets, over 10 seconds, every minute. You send one packet per second and scrape every 15. This has the same overall effect, but the measurement is, IMO, more accurate, as it's a continuous stream. There's no 50 second gap of no metrics about the ICMP stream.
Also, you don't get back one metric for those 20 packets, you get several. Min, Max, Avg, StdDev. With the histogram data, you can calculate much more than just that using the raw data.
For example, IMO, avg and max are not all that useful for continuous stream monitoring. What I really want to know is the 90th percentile or 99th percentile.
This smokeping prober is not intended to be a one-to-one replacement for exactly smokeping's real implementation. But simply provide similar functionality, using the power of Prometheus and PromQL to make it better.
[...]
one of the reason I prefer the histogram datatype, is you can use the heatmap panel type in Grafana, which is superior to the individual min/max/avg/stddev metrics that come from smokeping.
Say you had two routes, one slow and one fast. And some pings are sent over one and not the other. Rather than see a wide min/max equaling a wide stddev, the heatmap would show a "line" for both routes.
That's an interesting point. I have also ended up adding a heatmap graph to my dashboard, independently. And it is true it shows those "lines" much better... So maybe that, if we ignore legacy, we're actually happy with what we get, even with the plain blackbox exporter.
So yes, we're missing pretty "fuzz" lines around the main lines, but maybe that's alright. It would be possible to do the equivalent to the InfluxDB hack, with queries like:
min_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
avg_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[5m])
max_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
The output looks something like this:
Looks more like Smokeping!
But there's a problem there: see how the middle graph "dips" sometimes below 20ms? That's the min_over_time function (incorrectly, IMHO) returning zero. I haven't quite figured out how to fix that, and I'm not sure it is better. But it does look more like Smokeping than the previous graph.
Update: I forgot to mention one big thing that this setup is missing. Smokeping has this nice feature that you can order and group probe targets in a "folder"-like hierarchy. It is often used to group probes by location, which makes it easier to scan a lot of targets. This is harder to do in this setup. It might be possible to setup location-specific "jobs" and select based on that, but it's not exactly the same.
Credits
Credits to Chris Siebenmann for his article about Prometheus and pings, which gave me the avg_over_time query idea.
ListenData
How to drop one or multiple columns from Pandas Dataframe
- Drop or Keep rows and columns
- Aggregate data by one or more columns
- Sort or reorder data
- Merge or append multiple dataframes
- String Functions to handle text data
- DateTime Functions to handle date or time format columns
First, import the required packages:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
A B C D
0 -1.236438 -1.656038 1.655995 -1.413243
1 0.507747 0.710933 -1.335381 0.832619
2 0.280036 -0.411327 0.098119 0.768447
3 0.858730 -0.093217 1.077528 0.196891
4 -0.905991 0.302687 0.125881 -0.665159
5 -2.012745 -0.692847 -1.463154 -0.707779
Drop a column in python
In pandas, the drop() function is used to remove column(s). axis=1 tells Python that you want to apply the function to columns instead of rows.

df.drop(['A'], axis=1)

Column A has been removed. See the output shown below.
B C D
0 -1.656038 1.655995 -1.413243
1 0.710933 -1.335381 0.832619
2 -0.411327 0.098119 0.768447
3 -0.093217 1.077528 0.196891
4 0.302687 0.125881 -0.665159
5 -0.692847 -1.463154 -0.707779
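The article's title also mentions dropping multiple columns; the same drop() call handles that. A few hedged variations against the df created above:

# Drop multiple columns in one call
df.drop(['A', 'B'], axis=1)

# Equivalent, using the columns keyword instead of axis
df.drop(columns=['A', 'B'])

# Drop in place instead of returning a new DataFrame
df.drop(['A', 'B'], axis=1, inplace=True)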
Codementor
Python App Development: Perfect Web Framework choice for Startups
Python app development is perfect for startups building web applications. Python web programming saves cost and time, reducing the application's time to market.
Wing Tips
Configuring Wing Pro's Python Debugger for Your Code Base
This Wing Tip provides a roadmap to the configuration options available for Wing's debugger, to make it easier to understand the available possibilities and how these can be applied to your development projects.
Configuration Options
Broadly speaking there are five ways to configure Wing's debugger, depending on whether your code runs locally or on a remote system, and whether it is launched from the IDE or from the outside:
Local Stand-Alone Code -- Wing can debug stand-alone scripts and applications that run on your local machine and that are launched on demand from within Wing. This development approach can be used for anything that is convenient to launch from the IDE, including scripts, desktop apps, and most web frameworks. See the Debugger Quick-Start for a quick introduction to this simple case.
Remote Stand-Alone Code -- Wing Pro can also debug stand-alone code running on a remote host, virtual machine or device, in the same way as it debugs locally running code. Wing uses a remote agent launched by SSH in order to work directly with files stored on the remote host, as if Wing were itself running on that system. For details, see Remote Development with Wing Pro.
Local Externally Launched or Embedded Code -- Wing can debug locally running code that is launched by a web server or framework, embedded Python code that is used to script a larger application, and any other Python code that cannot be directly launched from the IDE. In this case, the code is started from outside of Wing and connects to the IDE by importing Wing's debugger. Debug can be controlled from the IDE and through an API accessible from the debug process. For details, see Debugging Externally Launched Code
Remote Externally Launched or Embedded Code -- Wing Pro can also debug externally launched or embedded code that is running on a remote system. In this case, Wing uses a remote agent to access the remote host via SSH and the debugged code imports Wing's debugger in order to connect back to the IDE through an automatically established reverse SSH tunnel. See Debugging Externally Launched Remote Code for brief instructions or Remote Web Development for a more detailed guide.
Manually Configured Remote Debugging -- For remote hosts and devices that are not accessible through SSH, or where Wing's remote agent cannot be run, Wing provides a manual configuration option to make debugging on these systems possible. In this case, the device must be able to connect to the host where Wing is running via TCP/IP, and there must be some file sharing configuration so files are available both locally and on the remote system. In this approach, connectivity, file sharing, and other configuration needed to make debugging possible is accomplished entirely manually, so it can be tailored to unusual custom environments. For details, see Manually Configured Remote Debugging.
Coming Soon in Wing 8
Although not yet available, it's worth mentioning another type of debug configuration that is coming soon:
Containerized Code -- Wing Pro 8 will be able to debug code running in containers like those provided by Docker, without requiring access to the container through SSH and without labor-intensive manual remote debug configuration. In this model, the IDE works with the local files that are used to build the container, and launches code for unit tests and debug in the container environment. This capability should be available fairly soon through our early access program.
VirtualEnv and Anaconda Environments
In the context of each of the above, Wing may be used with or without an environment created by virtualenv or Anaconda's conda create. For local debugging, this is selected by the Python Executable in Project Properties. For remote debugging, it is set in the remote host configuration.
For virtualenv, you can either set the Python Executable to Command Line and enter the full path to the virtualenv's Python, or you can select Activated Env and enter the command that activates the virtual environment.
For Anaconda environments, you must select Activated Env and then choose the environment from the drop down list to the right of this field.
For more information on this, please see Using Wing with virtualenv and Using Wing with Anaconda.
Specific Frameworks and Tools
Some frameworks and tools require some additional custom configuration to make them easy to work with in Wing. In addition to understanding the general options explained above, it is a good idea to seek out configuration details for the frameworks and tools that you use:
- Wing's documentation contains configuration instructions for specific frameworks and tools such as Flask, Django, Jupyter, wxPython, PyQt, Blender, Maya, and others.
- There is also additional information available for specific kinds of remote development for AWS, Docker, Vagrant, Windows Subsystem for Linux, Raspberry Pi, and others.
The New Project dialog accessed from the Project menu provides some assistance for setting up new Wing projects for most of these.
Multi-threaded and Multi-process Debugging
Wing automatically debugs any multi-threaded code without any additional configuration.
Multi-process code can also be debugged but requires turning on the Debug/Execute > Debug Child Processes option in Project Properties before child processes are automatically debugged. In this case, you may also want to configure specific options for how Wing handles and terminates child processes. See Multi-Process Debugging for details.
That's it for now! We'll be back soon with more Wing Tips for Wing Python IDE.
As always, please don't hesitate to email support@wingware.com if you run into problems or have any questions.
Matt Layman
Designing A View - Building SaaS #59
In this episode, I focused on a single view for adding a course to a school year. This view is reusing a form class from a different part of the app and sharing a template. We worked through the details of making a clear CreateView. The stream started with me answering a question about how I design a new feature. I outlined all the things that I think through for the different kinds of features that I need to build.
June 03, 2020
PyCharm
PyCharm 2020.1.2
PyCharm 2020.1.2 is out now with fixes that will improve your software development experience. Update from within PyCharm (Help | Check for Updates), using the JetBrains Toolbox, or by downloading the new version from our website.
In this version of PyCharm:
- The new action ‘Rescan Available Python Modules and Packages’ was added
- We added support for Coverage 5.0+
- We fixed a bug that made the cursor jump to the __call__ method of the metaclass instead of the class declaration when using ‘go to declaration’ on classes.
- We fixed a bug that triggered a false positive inspection “Unexpected argument” for Python 3 enum.Enum() functional constructor.
- We fixed a bug so that the CSS/SCSS formatter is now aware of CSS3 grid-layout properties.
- We fixed a bug that made the data area in the ‘Data View’ window too small after the window was shrunk and expanded.
- We fixed a bug that was freezing the editor when displaying documentation for some Matplotlib symbols.
And many more small fixes, see our release notes for details.
Getting the New Version
You can update PyCharm by choosing Help | Check for Updates (or PyCharm | Check for Updates on macOS) in the IDE. PyCharm will be able to patch itself to the new version, so there should no longer be a need to run the full installer.
If you’re on Ubuntu 16.04 or later, or any other Linux distribution that supports snap, you should not need to upgrade manually, you’ll automatically receive the new version.
Stack Abuse
Binary Search in Python
Introduction
In this article, we'll be diving into the idea behind and Python implementation of Binary Search.
Binary Search is an efficient search algorithm that works on sorted arrays. It's often used as one of the first examples of algorithms that run in logarithmic time (O(log n)) because of its intuitive behavior, and is a fundamental algorithm in Computer Science.
Binary Search - Example
Binary Search works on a divide-and-conquer approach and relies on the fact that the array is sorted to eliminate half of possible candidates in each iteration. More specifically, it compares the middle element of the sorted array to the element it's searching for in order to decide where to continue the search.
If the target element is larger than the middle element - it can't be located in the first half of the collection so it's discarded. The same goes the other way around.
Note: If the array has an even number of elements, it doesn't matter which of the two "middle" elements we use to begin with.
Let's look at an example quickly before we continue to explain how binary search works:

As we can see, we know for sure that, since the array is sorted, x is not in the first half of the original array.
When we know in which half of the original array x is, we can repeat this exact process with that half and split it into halves again, discarding the half that surely doesn't contain x:

We repeat this process until we end up with a subarray that contains only one element. We check whether that element is x. If it is - we found x, if it isn't - x doesn't exist in the array at all.
If you take a closer look at this, you can notice that in the worst-case scenario (x not existing in the array), we need to check a much smaller number of elements than we'd need to in an unsorted array - which would require something more along the lines of Linear Search, which is insanely inefficient.
To be more precise, the number of elements we need to check in the worst case is log2(N), where N is the number of elements in the array.
This makes a bigger impact the bigger the array is:
If our array had 10 elements, we would need to check only 3 elements to either find x or conclude it's not there. That's 33.3%.
However, if our array had 10,000,000 elements we would only need to check 24 elements. That's 0.0002%.
Binary Search Implementation
Binary Search is a naturally recursive algorithm, since the same process is repeated on smaller and smaller arrays until an array of size 1 has been found. However, there is of course an iterative implementation as well, and we will be showing both approaches.
Recursive
Let's start off with the recursive implementation as it's more natural:
def binary_search_recursive(array, element, start, end):
    if start > end:
        return -1

    mid = (start + end) // 2
    if element == array[mid]:
        return mid

    if element < array[mid]:
        return binary_search_recursive(array, element, start, mid-1)
    else:
        return binary_search_recursive(array, element, mid+1, end)
Let's take a closer look at this code. We exit the recursion if the start index is greater than the end index:
if start > end:
    return -1
This is because this situation occurs only when the element doesn't exist in the array. What happens is that we end up with only one element in the current sub-array, and that element doesn't match the one we're looking for.
At this point, start is equal to end. However, since element isn't equal to array[mid], we "split" the array again in such a way that we either decrease end by 1, or increase start by 1, and the recursion exits on that condition.
We could have done this using a different approach:
if len(array) == 1:
    if element == array[mid]:
        return mid
    else:
        return -1
The rest of the code does the "check middle element, continue search in the appropriate half of the array" logic. We find the index of the middle element and check whether the element we're searching for matches it:
mid = (start + end) // 2
if element == array[mid]:
    return mid
If it doesn't, we check whether the element is smaller or larger than the middle element:
if element < array[mid]:
    # Continue the search in the left half
    return binary_search_recursive(array, element, start, mid-1)
else:
    # Continue the search in the right half
    return binary_search_recursive(array, element, mid+1, end)
Let's go ahead and run this algorithm, with a slight modification so that it prints out which subarray it's working on currently:
element = 18
array = [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
print("Searching for {}".format(element))
print("Index of {}: {}".format(element, binary_search_recursive(array, element, 0, len(array))))
Running this code will result in:
Searching for 18
Subarray in step 0:[1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
Subarray in step 1:[16, 18, 24, 28, 29]
Subarray in step 2:[16, 18]
Subarray in step 3:[18]
Index of 18: 7
It's clear to see how it halves the search space in each iteration, getting closer and closer to the element we're looking for. If we tried searching for an element that doesn't exist in the array, the output would be:
Searching for 20
Subarray in step 0: [4, 14, 16, 17, 19, 21, 24, 28, 30, 35, 36, 38, 39, 40, 41, 43]
Subarray in step 1: [4, 14, 16, 17, 19, 21, 24, 28]
Subarray in step 2: [19, 21, 24, 28]
Subarray in step 3: [19]
Index of 20: -1
And just for the fun of it, we can try searching some large arrays and seeing how many steps it takes Binary Search to figure out whether a number exists:
Searching for 421, in an array with 200 elements
Search finished in 6 steps. Index of 421: 169
Searching for 1800, in an array with 1500 elements
Search finished in 11 steps. Index of 1800: -1
Searching for 3101, in an array with 3000 elements
Search finished in 8 steps. Index of 3101: 1551
Iterative
The iterative approach is very simple and similar to the recursive approach. Here, we just perform the checks in a while loop:
def binary_search_iterative(array, element):
    mid = 0
    start = 0
    end = len(array) - 1
    step = 0

    while (start <= end):
        print("Subarray in step {}: {}".format(step, str(array[start:end+1])))
        step = step + 1
        mid = (start + end) // 2

        if element == array[mid]:
            return mid

        if element < array[mid]:
            end = mid - 1
        else:
            start = mid + 1

    return -1
Let's populate an array and search for an element within it:
element = 18
array = [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]

print("Searching for {} in {}".format(element, array))
print("Index of {}: {}".format(element, binary_search_iterative(array, element)))
Running this code gives us the output of:
Searching for 18 in [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
Subarray in step 0: [1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29]
Subarray in step 1: [16, 18, 24, 28, 29]
Subarray in step 2: [16, 18]
Subarray in step 3: [18]
Index of 18: 7
Conclusion
Binary Search is an incredible algorithm to use on large, sorted arrays, or whenever we plan to search for elements repeatedly in a single array.
The cost of sorting the array once and then using Binary Search to find elements in it multiple times is far better than using Linear Search on an unsorted array just so we could avoid the cost of sorting it.
If we're sorting the array and searching for an element just once, it's more efficient to just do a Linear Search on the unsorted array.
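As a side note, if you need binary search in practice rather than as an exercise, Python's standard library already ships one in the bisect module; a small sketch of using it to find an element's index:

from bisect import bisect_left

def bisect_index(array, element):
    # array must already be sorted
    i = bisect_left(array, element)
    if i < len(array) and array[i] == element:
        return i
    return -1

print(bisect_index([1, 2, 5, 7, 13, 15, 16, 18, 24, 28, 29], 18))  # 7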
If you'd like to read about Sorting Algorithms in Python, we've got you covered!
PyCharm
Introducing the PyCharm Guide
Want to be a badass at Python coding with PyCharm? Keep reading!
Over the last few years we have been collecting productivity tips, tutorials, and a lot more into a central, video-oriented resource, and now we are ready to introduce you to our brand new PyCharm Guide!
The PyCharm Guide currently houses tips, tutorials, and playlists organized by technology and topic. Many of the tips have an in-depth section with a narrated video. We currently have one tutorial and many more on the way with tutorial steps showing full videos and writeups with working code.
The Guide is made for the community, so we intend it to be an open-source project. It’s in GitHub, with one external contribution already, and a relatively-easy Markdown-based contribution format. We also have Guides from other products, both deployed and on the way.
The PyCharm Guide is a shift from publishing bunches of individual pieces to telling stories, over time, across platforms. It is closely linked to our new PyCharm YouTube channel and our Twitter account, as well as this blog, to weave together our stories and storytelling.
More to follow in the coming weeks, especially on the way to the Python Web Conference and our PyCharm tutorial there. If you have any suggestions on how we can rethink advocacy, education, and storytelling, drop a comment!
Real Python
Regular Expressions: Regexes in Python (Part 2)
In the previous tutorial in this series, you covered a lot of ground. You saw how to use re.search() to perform pattern matching with regexes in Python and learned about the many regex metacharacters and parsing flags that you can use to fine-tune your pattern-matching capabilities.
But as great as all that is, the re module has much more to offer.
In this tutorial, you’ll:
- Explore more functions, beyond re.search(), that the re module provides
- Learn when and how to precompile a regex in Python into a regular expression object
- Discover useful things that you can do with the match object returned by the functions in the re module
Ready? Let’s dig in!
Free Bonus: Get a sample chapter from Python Basics: A Practical Introduction to Python 3 to see how you can go from beginner to intermediate in Python with a complete curriculum, up-to-date for Python 3.8.
re Module Functions
In addition to re.search(), the re module contains several other functions to help you perform regex-related tasks.
Note: You saw in the previous tutorial that re.search() can take an optional <flags> argument, which specifies flags that modify parsing behavior. All the functions shown below, with the exception of re.escape(), support the <flags> argument in the same way.
You can specify <flags> as either a positional argument or a keyword argument:
re.search(<regex>, <string>, <flags>)
re.search(<regex>, <string>, flags=<flags>)
The default for <flags> is always 0, which indicates no special modification of matching behavior. Remember from the discussion of flags in the previous tutorial that the re.UNICODE flag is always set by default.
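Multiple flags can also be combined with the bitwise OR (|) operator; a quick illustrative example of my own, not taken from the tutorial:

>>> re.search(r'^bar', 'FOO\nBAR', re.IGNORECASE | re.MULTILINE)
<_sre.SRE_Match object; span=(4, 7), match='BAR'>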
The available regex functions in the Python re module fall into the following three categories:
- Searching functions
- Substitution functions
- Utility functions
The following sections explain these functions in more detail.
Searching Functions
Searching functions scan a search string for one or more matches of the specified regex:
| Function | Description |
|---|---|
| re.search() | Scans a string for a regex match |
| re.match() | Looks for a regex match at the beginning of a string |
| re.fullmatch() | Looks for a regex match on an entire string |
| re.findall() | Returns a list of all regex matches in a string |
| re.finditer() | Returns an iterator that yields regex matches from a string |
As you can see from the table, these functions are similar to one another. But each one tweaks the searching functionality in its own way.
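Here is a tiny REPL comparison of how these functions behave on the same input (my own example string, not from the tutorial):

>>> s = 'foo123bar456'
>>> print(re.match(r'\d+', s))       # the string doesn't start with digits
None
>>> re.search(r'\d+', s)             # first match anywhere in the string
<_sre.SRE_Match object; span=(3, 6), match='123'>
>>> re.findall(r'\d+', s)            # every match, as a list of strings
['123', '456']
>>> print(re.fullmatch(r'\d+', s))   # the entire string is not all digits
None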
re.search(<regex>, <string>, flags=0)
Scans a string for a regex match.
If you worked through the previous tutorial in this series, then you should be well familiar with this function by now. re.search(<regex>, <string>) looks for any location in <string> where <regex> matches:
>>> re.search(r'(\d+)', 'foo123bar')
<_sre.SRE_Match object; span=(3, 6), match='123'>
>>> re.search(r'[a-z]+', '123FOO456', flags=re.IGNORECASE)
<_sre.SRE_Match object; span=(3, 6), match='FOO'>
>>> print(re.search(r'\d+', 'foo.bar'))
None
The function returns a match object if it finds a match and None otherwise.
re.match(<regex>, <string>, flags=0)
Looks for a regex match at the beginning of a string.
Read the full article at https://realpython.com/regex-python-part-2/ »
CubicWeb
Report of June 3rd Cubicweb Meeting
Hi everyrone,
Version 3.28-rc1 is on its way! First, let's have a look to the issue board state.
Milestone update
- Introduced types #10
- logilab.common.deprecation has been typed (see hackathon report below): done
- Add tests for the content negotiation !20: MR about to be accepted
- Update logilab-common changelogs #43 : done
- Add automatic doc re-build to the CubicWeb CI #8 : done
Todo
- Review and accept MR !20
- Release logilab-common and cubicweb 3.28-rc1
Semver discussions
Right now, dependencies are only specifying a minimal version. So if we introduce a breaking change in a new version, apps might break too. We plan to follow semver convention to prevent this from happening.
We also discussed the idea of aligning version between compatible tools, so every major version would work with the same major version of other tools/dependencies.
This idea will be introduced in 3.29 documentation, but will probably start with the release of Cubicweb version 4.
Hackathon
Last Friday we did an internal hackathon at Logilab and Laurent, Noé and I spent time working on Cubicweb. We mainly:
- wrote changelogs for:
- logilab-common
- cubicweb
- tried to add a Merge Request template on Cubicweb
- doesn't work on Heptapod actually, we will ask Octobus to have a look (see #46)
- added annotation types on logilab.common.deprecated
- improved tox.ini and added a gitlab-ci.yaml file in the cube skeleton
That's all! You should receive an email soon about the rc1 release.
Thanks for reading,
Henri
Zato Blog
A dark theme for auto-generated API documentation
Starting with version 3.2, Zato will use a new, dark theme for its auto-generated API documentation and specifications. Here is its preview.
An API service and its documentation
Suppose we have a Zato service with I/O as defined below ..
# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class Login(Service):
    """ Logs a user in.

    - Sets session to expire in 24 hours if unused
    - Metadata stored for each session:
      - Creation time in UTC
      - Remote IP and fully-qualified name
    """
    name = 'my.api.login'

    class SimpleIO:
        """
        * user_name - User name to log in as
        * password - User password

        * token - Session token
        * user_id - User's ID
        * user_display_name - User's display name, e.g. first and last name
        * user_email - User's email
        * preferred_language - User's preferred language in the system
        """
        input_required = 'user_name', 'password'
        output_required = 'token', 'user_id', 'user_display_name', 'user_email'
        output_optional = 'preferred_language'

    def handle(self):
        # Skip actual implementation
        pass
.. here is how its documentation looks with the dark theme, as generated by zato apispec.
Note that, as previously, the quick access WSDL and OpenAPI links are in the downloads section, on the left-hand side.
More links
To check it out live, here is a link to the actual documentation from the screenshot above. Also, Zato has its own API documentation whose specification you can find here.
Coming up next
Yet, there is more. The new theme is but a part of a series of works focused on API documentation and specifications. Coming up next are:
- A service invoker to execute services directly from their documentation
- A mobile version
- Searchable API specifications
Stay tuned for more news.
Django Weblog
Django security releases issued: 3.0.7 and 2.2.13
In accordance with our security release policy, the Django team is issuing Django 3.0.7 and Django 2.2.13. These releases address the security issue detailed below. We encourage all users of Django to upgrade as soon as possible.
CVE-2020-13254: Potential data leakage via malformed memcached keys
In cases where a memcached backend does not perform key validation, passing malformed cache keys could result in a key collision, and potential data leakage. In order to avoid this vulnerability, key validation is added to the memcached cache backends.
Thank you to Dan Palmer for the report and patch.
CVE-2020-13596: Possible XSS via admin ForeignKeyRawIdWidget
Query parameters for the admin ForeignKeyRawIdWidget were not properly URL encoded, posing an XSS attack vector. ForeignKeyRawIdWidget now ensures query parameters are correctly URL encoded.
Thank you to Jon Dufresne for the report and patch.
Affected supported versions
- Django master branch
- Django 3.1 (currently at alpha status)
- Django 3.0
- Django 2.2
Resolution
Patches to resolve the issue have been applied to Django's master branch and the 3.1, 3.0, and 2.2 release branches. The patches may be obtained from the following changesets:
CVE-2020-13254:
- On the master branch
- On the 3.1 release branch
- On the 3.0 release branch
- On the 2.2 release branch
CVE-2020-13596:
- On the master branch
- On the 3.1 release branch
- On the 3.0 release branch
- On the 2.2 release branch
The following releases have been issued:
- Django 3.0.7 (download Django 3.0.7 | 3.0.7 checksums)
- Django 2.2.13 (download Django 2.2.13 | 2.2.13 checksums)
The PGP key ID used for these releases is Carlton Gibson: E17DF5C82B4F9D00.
General notes regarding security reporting
As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.
Codementor
My Top 7 Picks on PyCon 2020 Online
My top 7 picks for PyCon 2020 videos that are useful for Python developers, data scientists & educators.
Karim Elghamrawy
How to Setup Python 3 on Windows? (Step-by-Step)
The post How to Setup Python 3 on Windows? (Step-by-Step) appeared first on Afternerd.
Kushal Das
Onion location and Onion names in Tor Browser 9.5
Yesterday the Tor Browser 9.5 was released. I am excited about this release for some user-focused updates.
Onion-Location header
If your webserver provides this one extra header, Onion-Location, the Tor Browser will ask the user if they want to visit the onion site itself. The user can even choose to visit every such onion site by default. See it in action here.
To enable this, in Apache, you need a configuration line like below for your website’s configuration.

Header set Onion-Location "http://your-onion-address.onion%{REQUEST_URI}s"
Remember to enable rewrite module.
For nginx, add the following in your server configuration.
add_header Onion-Location http://<your-onion-address>.onion$request_uri;
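The two snippets above set the header at the web-server level; if a site is instead served directly by a Python web application, the same header can be added from application code. A minimal, hypothetical Flask sketch (the onion address is a placeholder):

from flask import Flask, request

app = Flask(__name__)

@app.after_request
def add_onion_location(response):
    # Advertise a hypothetical onion mirror of this site, preserving the request path
    onion = 'http://your-onion-address.onion'
    response.headers['Onion-Location'] = onion + request.full_path.rstrip('?')
    return response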
URLs we can remember, aka onion names
This is the first proof of concept built along with Freedom of the Press Foundation (yes, my team) and HTTPS Everywhere to help people to use simple names for onion addresses. For example, below, you can see that I typed theintercept.securedrop.tor.onion on the browser, and that took us to The Intercept's SecureDrop address.

June 02, 2020
Obey the Testing Goat
Cosmic Python
Folks I've written a new book!
Along with my coauthor Bob, we are proud to release "Architecture Patterns with Python", which you can find out more about at cosmicpython.com.
The cosmic soubriquet is a little joke, Cosmos being the opposite of Chaos in ancient Greek, so we want to propose patterns to minimise chaos in your applications.
But the subtitle of the book is Enabling TDD, DDD, and Event-Driven Microservices, and the TDD part is relevant to this blog, and to fans of the Testing Goat. In my two years at MADE and working with Bob, I've refined some of my thinking and some of the ways I approach testing, and I think if I were writing TDDwP again today, I might change the way I present some things.
In brief:
- Mocking is not the only way to handle external (I/O et al) dependencies for your unit tests. Other techniques are possible, and often offer major benefits.
- If you really want to get to a test pyramid (where unit tests outnumber slow/e2e/integration tests by an order of magnitude), then you'll probably need to make some specific design choices around identifying business logic and decoupling it from infrastructure code.
- When deciding what kind of unit tests to write, there's a lot to be said for writing them at the highest level of abstraction possible. It gives you more room to refactor later.
If you're curious about those questions, head on over to cosmicpython.com, and let me know what you think!
Jaime Buelta
2nd Edition for Python Automation Cookbook now available!
Good news everyone! There’s a new edition of the Python Automation Cookbook! A great way of improving your Python skills for practical tasks! Like the first edition, it’s aimed at people who already know a bit of Python (not necessarily developers). It describes how to automate common tasks: things like working with different kinds of documents, generating graphs, sending emails, text messages… You can check the whole table of contents for more details. It’s written in the cookbook format, so it’s a collection of recipes to read and reference independently. There are... Read More