In this quick tip, excerpted from Useful Python, Stuart looks at ways to control the Windows OS with Python.
Key Takeaways
- Python can be used to control Windows OS using the Win32 API, with wrappers available to make coding easier. Python’s winreg module, for example, can interact with the Windows Registry without requiring extra installation.
- The PyWin32 module contains the Win32 Python API, allowing access to the Windows Shell API. This can be used to perform various operations, such as finding the location of the Program Files folder. PyGetWindow is another useful module, enabling control and enumeration of on-screen windows.
- Python’s PyGetWindow module provides functions for finding windows. These include getWindowsWithTitle(), getWindowsAt(), getAllWindows(), getAllTitles(), and getActiveWindow(). The Window objects they return can be minimized, maximized, moved, resized, and brought to the front.
The Windows Registry
Windows is entirely controllable from code using the Win32 API, and Microsoft provides extensive documentation at Microsoft Docs for everything that Windows can programmatically do. All of this is accessible from Python as well, although it can seem a little impenetrable if we’re not already accustomed to the Win32 API’s particular way of working. Fortunately, there are various wrappers for these low-level APIs to make code easier to write for Python programmers.
A simple example is interacting with the Windows Registry. Python includes the winreg module for this out of the box, so no extra installation is required. As an example, let’s check where the Program Files folder actually lives:
>>> import winreg
>>> hive = winreg.ConnectRegistry(None, winreg.HKEY_LOCAL_MACHINE)
>>> key = winreg.OpenKey(hive, r"SOFTWARE\Microsoft\Windows\CurrentVersion")
>>> value, value_type = winreg.QueryValueEx(key, "ProgramFilesDir")
>>> value
'C:\\Program Files'
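The same lookup can be wrapped in a small helper. This is only a sketch: the function name read_registry_string is made up for illustration, and it assumes we only ever read from HKEY_LOCAL_MACHINE. It closes the key when done and returns None rather than raising if the key or value is missing (or if the code is run somewhere other than Windows, where winreg doesn't exist).

```python
import sys


def read_registry_string(subkey, value_name):
    """Return a value from HKEY_LOCAL_MACHINE, or None if it can't be read.

    Hypothetical helper for illustration; not part of winreg itself.
    """
    if sys.platform != "win32":
        return None  # the winreg module only exists on Windows
    import winreg

    try:
        # The handle returned by OpenKey works as a context manager,
        # so the key is closed when the with block exits.
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
            value, _value_type = winreg.QueryValueEx(key, value_name)
            return value
    except OSError:
        # Raised when the key or value doesn't exist, or access is denied.
        return None


program_files = read_registry_string(
    r"SOFTWARE\Microsoft\Windows\CurrentVersion", "ProgramFilesDir"
)
print(program_files)
```

Returning None keeps the calling code simple, at the cost of hiding why a lookup failed; a real script might prefer to let the OSError propagate.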
Raw Strings
In the code above, we’re using “raw strings” to specify the key name:
r"SOFTWARE\Microsoft\Windows\CurrentVersion"
Strings passed to the Win32 API often include the backslash character (\), because Windows uses it in file paths and registry paths to divide one directory from the next.
However, Python uses a backslash as an escape character to allow adding special, untypeable characters to a string. For example, the Python string "first line\nsecond line" is a string with a newline character in it, so that the text is spread over two lines. This would conflict with the Windows path character: a file path such as "C:\newdir\myfile.txt" would have the \n interpreted as a newline.
Raw strings avert this: prefixing a Python string with r removes the special meaning of a backslash, so that r"C:\newdir\myfile.txt" is interpreted as intended. We can see that backslashes are treated specially in the value we get back for the folder location: it’s printed as 'C:\\Program Files', with the backslash doubled to remove its special meaning, but that’s only how Python displays the string. The actual value contains a single backslash, and Python could equally have printed it as r'C:\Program Files'.
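A short experiment makes the difference concrete. The path C:\newdir is just an example value; the point is that the plain string loses a character to the \n escape, while the raw string and the doubled-backslash string are the same thing.

```python
plain = "C:\newdir"      # \n is read as a single newline character
raw = r"C:\newdir"       # backslash and n are kept as two characters
doubled = "C:\\newdir"   # escaping the backslash by hand

print(len(plain))        # 8
print(len(raw))          # 9
print(raw == doubled)    # True
print(repr(raw))         # 'C:\\newdir'  (how the REPL displays it)
print(raw)               # C:\newdir     (the actual contents)
```

One wrinkle worth knowing: a raw string can't end in a single backslash, so r"C:\newdir\" is a syntax error. Writing "C:\\newdir\\" works where a trailing backslash is needed.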
The Windows API
Reading the registry (and even more so, writing to it) is the source of a thousand hacks on web pages (many of which are old, shouldn’t be linked to, and use the ancient REGEDT32.EXE), but it’s better to actually use the API for this. (Raymond Chen has written many long sad stories about why we should use the API and not the registry.) How would we use the Win32 API from Python to work this out?
The Win32 Python API is available in the PyWin32 module, which can be obtained with python -m pip install pywin32. The documentation for the module is rather sparse, but the core idea is that most of the Windows Shell API (that’s concerned with how the Windows OS is set up) is available in the win32com.shell package. To find out the location of the Program Files folder, MSDN shows that we need the SHGetKnownFolderPath function, to which is passed a KNOWNFOLDERID constant and a flag set to 0. Shell constants are available to Python in win32com.shell.shellcon (for “shell constants”), which means that finding the Program Files folder requires just one (admittedly complex) line:
>>> from win32com.shell import shell, shellcon
>>> shell.SHGetKnownFolderPath(shellcon.FOLDERID_ProgramFiles, 0)
'C:\\Program Files'
Digging around in the depths of the Win32 API gives us access to anything we may want to access in Windows (including windows!), but as we’ve seen, it can be quite complicated to find out how to do what we need to, and then to translate that need into Python. Fortunately, there are wrapper libraries for many of the functions commonly used. One good example is PyGetWindow, which allows us to enumerate and control on-screen windows. (It claims to be cross-platform, but it actually only works on Windows. But that’s all we need here.)
We can install PyGetWindow with python -m pip install pygetwindow, and then list all the windows on screen and manipulate them:
>>> import pygetwindow as gw
>>> allMSEdgeWindows = gw.getWindowsWithTitle("edge")
>>> allMSEdgeWindows
[Win32Window(hWnd=197414), Win32Window(hWnd=524986)]
>>> allMSEdgeWindows[0].title
'pywin32 · PyPI - Microsoft Edge'
>>> allMSEdgeWindows[1].title
'Welcome to Python.org - Microsoft Edge'
Those windows can be controlled. A window object can be minimized and restored, or resized and moved around the screen, and focused and brought to the front:
>>> pythonEdgeWindow = allMSEdgeWindows[1]
>>> pythonEdgeWindow.minimize()
>>> pythonEdgeWindow.restore()
>>> pythonEdgeWindow.size
Size(width=1050, height=708)
>>> pythonEdgeWindow.topleft
Point(x=218, y=5)
>>> pythonEdgeWindow.resizeTo(800, 600)
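Putting those methods together, here is a rough sketch that lines up a set of windows in equal columns. The function name tile_side_by_side and the 1920×1080 screen size are assumptions for illustration; the only window methods it relies on are restore(), moveTo(), and resizeTo().

```python
def tile_side_by_side(windows, screen_width, screen_height):
    """Arrange the given windows in equal-width columns (hypothetical helper)."""
    if not windows:
        return
    column_width = screen_width // len(windows)
    for index, window in enumerate(windows):
        window.restore()  # in case the window is minimized or maximized
        window.moveTo(index * column_width, 0)
        window.resizeTo(column_width, screen_height)


try:
    import pygetwindow as gw
except (ImportError, NotImplementedError):
    gw = None  # not installed, or running on an unsupported platform

if gw is not None:
    # Assumes a 1920x1080 screen; adjust to suit.
    tile_side_by_side(gw.getWindowsWithTitle("edge"), 1920, 1080)
```

Because the helper only calls methods on whatever objects it’s given, it can be tried out with stand-in objects before being let loose on real windows.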
It’s always worth looking on PyPI for wrapper modules that provide a more convenient API for whatever we’re trying to do with windows or with Windows. But if need be, we have access to the whole Win32 API from Python, and that will let us do anything we can think of.
This article is excerpted from Useful Python, available on SitePoint Premium and from ebook retailers.
Frequently Asked Questions (FAQs) about Controlling Windows with Python
How Can I Install the PyGetWindow Module?
To install the PyGetWindow module, you need to use pip, which is a package installer for Python. Open your command prompt or terminal and type the following command: pip install pygetwindow. If you have both Python 2 and Python 3 installed, use pip3 install pygetwindow to ensure you’re installing for the correct version.
What Functions are Available in the PyGetWindow Module?
The PyGetWindow module provides several functions for finding windows. Some of the key functions include getWindowsWithTitle(), getWindowsAt(), getAllWindows(), getAllTitles(), and getActiveWindow(). Each serves a different purpose in identifying windows; manipulating them happens through the Window objects that most of these functions return.
How Can I Get a Specific Window Using PyGetWindow?
To get a specific window, you can use the getWindowsWithTitle() function. This function returns a list of Window objects whose titles contain the string you passed, ignoring case. For example, pygetwindow.getWindowsWithTitle('My Application') would return every window with ‘My Application’ somewhere in its title.
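As the earlier "edge" example suggests, the match is on part of the title rather than the whole of it, so the list may contain more windows than you wanted. A minimal sketch for narrowing it down to an exact title follows; the helper name windows_with_exact_title is made up, and it relies only on each window having a title attribute.

```python
def windows_with_exact_title(windows, title):
    """Keep only the windows whose title is exactly `title` (hypothetical helper)."""
    return [window for window in windows if window.title == title]


try:
    import pygetwindow as gw
except (ImportError, NotImplementedError):
    gw = None  # not installed, or running on an unsupported platform

if gw is not None:
    candidates = gw.getWindowsWithTitle("My Application")
    exact = windows_with_exact_title(candidates, "My Application")
    print(len(candidates), "partial matches,", len(exact), "exact")
```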
How Can I Minimize or Maximize a Window Using PyGetWindow?
PyGetWindow provides the minimize() and maximize() methods for Window objects. To use these methods, you first need to get the Window object for the window you want to manipulate. Once you have the Window object, you can call myWindow.minimize() or myWindow.maximize() to minimize or maximize the window, respectively.
How Can I Move a Window to a Specific Location on the Screen?
To move a window, you can use the moveTo() method of a Window object. This method takes two arguments: the x and y coordinates of the new location. For example, myWindow.moveTo(100, 200) would move the window to the point (100, 200) on the screen.
How Can I Resize a Window Using PyGetWindow?
To resize a window, you can use the resizeTo() method of a Window object. This method takes two arguments: the new width and height of the window. For example, myWindow.resizeTo(800, 600) would resize the window to be 800 pixels wide and 600 pixels tall.
How Can I Bring a Window to the Front?
To bring a window to the front, you can use the activate() method of a Window object. This method doesn’t take any arguments. For example, myWindow.activate() would give the window focus and bring it in front of all other windows.
How Can I Check if a Window is Visible or Not?
To check if a window is visible, you can use the visible property of a Window object. It holds a boolean value indicating whether the window is visible or not. For example, myWindow.visible would be True if the window is visible, and False otherwise.
How Can I Close a Window Using PyGetWindow?
Window objects provide a close() method. Calling myWindow.close() asks the window to close, much like clicking its X button, so the application may still show a confirmation prompt or decline to close. As an alternative, you can use the pyautogui module in conjunction with PyGetWindow: first give the window focus with myWindow.activate(), then use pyautogui.hotkey('alt', 'f4') to send Alt+F4 to it.
Can I Use PyGetWindow with Other GUI Automation Libraries?
Yes, PyGetWindow can be used in conjunction with other GUI automation libraries like pyautogui and pynput. These libraries provide additional functionality for controlling the mouse and keyboard, which can be useful for automating tasks that involve interacting with windows.
Stuart is a consultant CTO, software architect, and developer to startups and small firms on strategy, custom development, and how to best work with the dev team. Code and writings are to be found at kryogenix.org and social networks; Stuart himself is mostly to be found playing D&D or looking for the best vodka Collins in town.