{article Dive into Python}{title} {text} {/article}

Constructing Pathnames

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.path.join("c:\\music\\ap\\", "mahadeva.mp3")
'c:\\music\\ap\\mahadeva.mp3'
>>> os.path.join("c:\\music\\ap", "mahadeva.mp3")
'c:\\music\\ap\\mahadeva.mp3'
>>> os.path.expanduser("~")
'C:\\Users\\Tom'
>>> os.path.join(os.path.expanduser("~"), "Python")
'C:\\Users\\Tom\\Python'
>>>

os.path is a reference to a module −− which module depends on your platform. Just as getpass encapsulates differences between platforms by setting getpass to a platform−specific function, os encapsulates differences between platforms by setting path to a platform−specific module.

The join function of os.path constructs a pathname out of one or more partial pathnames. In this case, it simply concatenates strings. (Note that dealing with pathnames on Windows is annoying because the backslash character must be escaped.)

In this slightly less trivial case, join will add an extra backslash to the pathname before joining it to the filename. I was overjoyed when I discovered this, since addSlashIfNecessary is one of the stupid little functions I always need to write when building up my toolbox in a new language. Do not write this stupid little function in Python; smart people have already taken care of it for you.

expanduser will expand a pathname that uses ~ to represent the current user's home directory. This works on any platform where users have a home directory, like Windows, UNIX, and Mac OS X; it has no effect on Mac OS.

Combining these techniques, you can easily construct pathnames for directories and files under the user's home directory.

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.path.join("c:\\music\\ap\\", "mahadeva.mp3")
'c:\\music\\ap\\mahadeva.mp3'
>>> os.path.join("c:\\music\\ap", "mahadeva.mp3")
'c:\\music\\ap\\mahadeva.mp3'
>>> os.path.expanduser("~")
'C:\\Users\\Tom'
>>> os.path.join(os.path.expanduser("~"), "Python")
'C:\\Users\\Tom\\Python'
>>>


</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}

Splitting Pathnames

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.path.split("c:\\music\\ap\\mahadeva.mp3")
('c:\\music\\ap', 'mahadeva.mp3')
>>> (filepath, filename) = os.path.split("c:\\music\\ap\\mahadeva.mp3")
>>> filepath
'c:\\music\\ap'
>>> filename
'mahadeva.mp3'
>>> (shortname, extension) = os.path.splitext(filename)
>>> shortname
'mahadeva'
>>> extension
'.mp3'
>>>

The split function splits a full pathname and returns a tuple containing the path and filename. Remember when I said you could use multi−variable assignment to return multiple values from a function? Well, split is such a function.

You assign the return value of the split function into a tuple of two variables. Each variable receives the value of the corresponding element of the returned tuple.

The first variable, filepath, receives the value of the first element of the tuple returned from split, the file path.

The second variable, filename, receives the value of the second element of the tuple returned from split, the filename.

os.path also contains a function splitext, which splits a filename and returns a tuple containing the filename and the file extension. You use the same technique to assign each of them to separate variables.

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.path.split("c:\\music\\ap\\mahadeva.mp3")
('c:\\music\\ap', 'mahadeva.mp3')
>>> (filepath, filename) = os.path.split("c:\\music\\ap\\mahadeva.mp3")
>>> filepath
'c:\\music\\ap'
>>> filename
'mahadeva.mp3'
>>> (shortname, extension) = os.path.splitext(filename)
>>> shortname
'mahadeva'
>>> extension
'.mp3'
>>>

</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}

Listing Directories

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.listdir("c:\\music\\_singles\\")
['01 - What I Want For Christmas.mp3', '02 - Fruitcake.mp3', '08 - The First Noel.mp3', '08 - Xmas Lights.mp3', '10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'desktop.ini', 'Dumpster Diver.mp3', 'kairo.mp3', 'Kairo22.mp3', 'Kalimba.mp3', 'Maid with the Flaxen Hair.mp3', 'Sleep Away.mp3', "Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>> dirname = "c:\\"
>>> os.listdir(dirname)
['$APDF', '$Recycle.Bin', "'", '1e7921671ef546b3785beaa0154785', '266e80aa36d3502b41c2bc7e242c78', '2db3954c8eefb3388b725ef79234', '5ab5ebbcfa8e2f95c3fe88', '6d3ce5c42ab139b94c4797e101fe', '6e2f869b9a48a4ef615ceaaa3857fe', '7b8e11cc96e8560164c15d61', '81f331dc183ced94edcc', '8539b20ca2d0ba9560712af1cb05', 'ACER', 'AdwCleaner', 'ANDROME NV', 'Art Explosion',
>>> [f for f in os.listdir(dirname)
if os.path.isfile(os.path.join(dirname, f))]
["'", 'AVScanner.ini', 'bootmgr', 'BOOTSECT.BAK', 'bootsqm.dat', 'eclipse-standard-luna-R-win32-x86_64.zip', 'eula.1028.txt', 'eula.1031.txt', 'eula.1033.txt', 'eula.1036.txt', 'eula.1040.txt', 'eula.1041.txt', 'eula.1042.txt', 'eula.2052.txt', 'eula.3082.txt', 'EUMONBMP.SYS', 'EventLOG.txt', 'GingerSetup.log', 'GingerSetupHelper.log', 'globdata.ini', 'hiberfil.sys',
>>> [f for f in os.listdir(dirname)
if os.path.isdir(os.path.join(dirname, f))]
['$APDF', '$Recycle.Bin', '1e7921671ef546b3785beaa0154785', '266e80aa36d3502b41c2bc7e242c78', '2db3954c8eefb3388b725ef79234', '5ab5ebbcfa8e2f95c3fe88', '6d3ce5c42ab139b94c4797e101fe', '6e2f869b9a48a4ef615ceaaa3857fe', '7b8e11cc96e8560164c15d61', '81f331dc183ced94edcc', '8539b20ca2d0ba9560712af1cb05', 'ACER', 'AdwCleaner', 'ANDROME NV', 'Art Explosion',
>>>

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.listdir("c:\\music\\_singles\\")
['01 - What I Want For Christmas.mp3', '02 - Fruitcake.mp3', '08 - The First Noel.mp3', '08 - Xmas Lights.mp3', '10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'desktop.ini', 'Dumpster Diver.mp3', 'kairo.mp3', 'Kairo22.mp3', 'Kalimba.mp3', 'Maid with the Flaxen Hair.mp3', 'Sleep Away.mp3', "Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>> dirname = "c:\\"
>>> os.listdir(dirname)
['$APDF', '$Recycle.Bin', "'", '1e7921671ef546b3785beaa0154785', '266e80aa36d3502b41c2bc7e242c78', '2db3954c8eefb3388b725ef79234', '5ab5ebbcfa8e2f95c3fe88', '6d3ce5c42ab139b94c4797e101fe', '6e2f869b9a48a4ef615ceaaa3857fe', '7b8e11cc96e8560164c15d61', '81f331dc183ced94edcc', '8539b20ca2d0ba9560712af1cb05', 'ACER', 'AdwCleaner', 'ANDROME NV', 'Art Explosion',
>>> [f for f in os.listdir(dirname)
if os.path.isfile(os.path.join(dirname, f))]
["'", 'AVScanner.ini', 'bootmgr', 'BOOTSECT.BAK', 'bootsqm.dat', 'eclipse-standard-luna-R-win32-x86_64.zip', 'eula.1028.txt', 'eula.1031.txt', 'eula.1033.txt', 'eula.1036.txt', 'eula.1040.txt', 'eula.1041.txt', 'eula.1042.txt', 'eula.2052.txt', 'eula.3082.txt', 'EUMONBMP.SYS', 'EventLOG.txt', 'GingerSetup.log', 'GingerSetupHelper.log', 'globdata.ini', 'hiberfil.sys',
>>> [f for f in os.listdir(dirname)
if os.path.isdir(os.path.join(dirname, f))]
['$APDF', '$Recycle.Bin', '1e7921671ef546b3785beaa0154785', '266e80aa36d3502b41c2bc7e242c78', '2db3954c8eefb3388b725ef79234', '5ab5ebbcfa8e2f95c3fe88', '6d3ce5c42ab139b94c4797e101fe', '6e2f869b9a48a4ef615ceaaa3857fe', '7b8e11cc96e8560164c15d61', '81f331dc183ced94edcc', '8539b20ca2d0ba9560712af1cb05', 'ACER', 'AdwCleaner', 'ANDROME NV', 'Art Explosion',
>>>

</pre>


<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}

Listing Directories in fileinfo.py

fileinfo.py

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

"""Framework for getting filetype-specific metadata.

Instantiate appropriate class with filename. Returned object acts like a
dictionary, with key-value pairs for each piece of metadata.
    import fileinfo
    info = fileinfo.MP3FileInfo("/music/ap/mahadeva.mp3")
    print "\\n".join(["%s=%s" % (k, v) for k, v in info.items()])

Or use listDirectory function to get info on all files in a directory.
    for info in fileinfo.listDirectory("/music/ap/", [".mp3"]):
        ...

Framework can be extended by adding classes for particular file types, e.g.
HTMLFileInfo, MPGFileInfo, DOCFileInfo. Each class is completely responsible for
parsing its files appropriately; see MP3FileInfo for example.

This program is part of "Dive Into Python", a free Python book for
experienced programmers. Visit http://diveintopython.org/ for the
latest version.
"""

__author__ = "Mark Pilgrim (This email address is being protected from spambots. You need JavaScript enabled to view it.)"
__version__ = "$Revision: 1.3 $"
__date__ = "$Date: 2004/05/05 21:57:19 $"
__copyright__ = "Copyright (c) 2001 Mark Pilgrim"
__license__ = "Python"

import os
import sys
from UserDict import UserDict

def stripnulls(data):
    "strip whitespace and nulls"
    return data.replace("\00", " ").strip()

class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename

class MP3FileInfo(FileInfo):
    "store ID3v1.0 MP3 tags"
    tagDataMap = {"title" : ( 3, 33, stripnulls),
                "artist" : ( 33, 63, stripnulls),
                "album" : ( 63, 93, stripnulls),
                "year" : ( 93, 97, stripnulls),
                "comment" : ( 97, 126, stripnulls),
                "genre" : (127, 128, ord)}

    def __parse(self, filename):
        "parse ID3v1.0 tags from MP3 file"
        self.clear()
        try:
            fsock = open(filename, "rb", 0)
            try:
                fsock.seek(-128, 2)
                tagdata = fsock.read(128)
            finally:
                fsock.close()
            if tagdata[:3] == 'TAG':
                for tag, (start, end, parseFunc) in self.tagDataMap.items():
                    self[tag] = parseFunc(tagdata[start:end])
        except IOError:
            pass

    def __setitem__(self, key, item):
        if key == "name" and item:
            self.__parse(item)
        FileInfo.__setitem__(self, key, item)

def listDirectory(directory, fileExtList):
    "get list of file info objects for files of particular extensions"
    fileList = [os.path.normcase(f) for f in os.listdir(directory)]
    fileList = [os.path.join(directory, f) for f in fileList \
                if os.path.splitext(f)[1] in fileExtList]
    def getFileInfoClass(filename, module=sys.modules[FileInfo.__module__]):
        "get file info class from filename extension"
        subclass = "%sFileInfo" % os.path.splitext(filename)[1].upper()[1:]
        return hasattr(module, subclass) and getattr(module, subclass) or FileInfo
    return [getFileInfoClass(f)(f) for f in fileList]

if __name__ == "__main__":
    for info in listDirectory("/music/_singles/", [".mp3"]):
        print ("\n".join(["%s=%s" % (k, v) for k, v in info.items()]))
        print


</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}

Listing Directories with glob

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.listdir("c:\\music\\_singles\\")
['01 - What I Want For Christmas.mp3', '02 - Fruitcake.mp3', '08 - The First Noel.mp3', '08 - Xmas Lights.mp3', '10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'desktop.ini', 'Dumpster Diver.mp3', 'kairo.mp3', 'Kairo22.mp3', 'Kalimba.mp3', 'Maid with the Flaxen Hair.mp3', 'Sleep Away.mp3', "Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>> import glob
>>> glob.glob('c:\\music\\_singles\\*.mp3')
['c:\\music\\_singles\\01 - What I Want For Christmas.mp3', 'c:\\music\\_singles\\02 - Fruitcake.mp3', 'c:\\music\\_singles\\08 - The First Noel.mp3', 'c:\\music\\_singles\\08 - Xmas Lights.mp3', 'c:\\music\\_singles\\10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'c:\\music\\_singles\\Dumpster Diver.mp3', 'c:\\music\\_singles\\kairo.mp3', 'c:\\music\\_singles\\Kairo22.mp3', 'c:\\music\\_singles\\Kalimba.mp3', 'c:\\music\\_singles\\Maid with the Flaxen Hair.mp3', 'c:\\music\\_singles\\Sleep Away.mp3', "c:\\music\\_singles\\Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>> glob.glob('c:\\music\\_singles\\s*.mp3')
['c:\\music\\_singles\\Sleep Away.mp3']
>>> glob.glob('c:\\music\\*\\*.mp3')
['c:\\music\\_singles\\01 - What I Want For Christmas.mp3', 'c:\\music\\_singles\\02 - Fruitcake.mp3', 'c:\\music\\_singles\\08 - The First Noel.mp3', 'c:\\music\\_singles\\08 - Xmas Lights.mp3', 'c:\\music\\_singles\\10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'c:\\music\\_singles\\Dumpster Diver.mp3', 'c:\\music\\_singles\\kairo.mp3', 'c:\\music\\_singles\\Kairo22.mp3', 'c:\\music\\_singles\\Kalimba.mp3', 'c:\\music\\_singles\\Maid with the Flaxen Hair.mp3', 'c:\\music\\_singles\\Sleep Away.mp3', "c:\\music\\_singles\\Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>>

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import os
>>> os.listdir("c:\\music\\_singles\\")
['01 - What I Want For Christmas.mp3', '02 - Fruitcake.mp3', '08 - The First Noel.mp3', '08 - Xmas Lights.mp3', '10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'desktop.ini', 'Dumpster Diver.mp3', 'kairo.mp3', 'Kairo22.mp3', 'Kalimba.mp3', 'Maid with the Flaxen Hair.mp3', 'Sleep Away.mp3', "Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>> import glob
>>> glob.glob('c:\\music\\_singles\\*.mp3')
['c:\\music\\_singles\\01 - What I Want For Christmas.mp3', 'c:\\music\\_singles\\02 - Fruitcake.mp3', 'c:\\music\\_singles\\08 - The First Noel.mp3', 'c:\\music\\_singles\\08 - Xmas Lights.mp3', 'c:\\music\\_singles\\10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'c:\\music\\_singles\\Dumpster Diver.mp3', 'c:\\music\\_singles\\kairo.mp3', 'c:\\music\\_singles\\Kairo22.mp3', 'c:\\music\\_singles\\Kalimba.mp3', 'c:\\music\\_singles\\Maid with the Flaxen Hair.mp3', 'c:\\music\\_singles\\Sleep Away.mp3', "c:\\music\\_singles\\Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>> glob.glob('c:\\music\\_singles\\s*.mp3')
['c:\\music\\_singles\\Sleep Away.mp3']
>>> glob.glob('c:\\music\\*\\*.mp3')
['c:\\music\\_singles\\01 - What I Want For Christmas.mp3', 'c:\\music\\_singles\\02 - Fruitcake.mp3', 'c:\\music\\_singles\\08 - The First Noel.mp3', 'c:\\music\\_singles\\08 - Xmas Lights.mp3', 'c:\\music\\_singles\\10 - Jingle Bells (Robbie Hardkiss Remix).mp3', 'c:\\music\\_singles\\Dumpster Diver.mp3', 'c:\\music\\_singles\\kairo.mp3', 'c:\\music\\_singles\\Kairo22.mp3', 'c:\\music\\_singles\\Kalimba.mp3', 'c:\\music\\_singles\\Maid with the Flaxen Hair.mp3', 'c:\\music\\_singles\\Sleep Away.mp3', "c:\\music\\_singles\\Twisted Sister - 00 - We're Not Gonna Take It.mp3"]
>>>


</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}