Guide to Python introspection https://www.ibm.com/developerworks/library/l-pyint/

Guide to Python introspection

How to spy on your Python objects

 
Published on December 01, 2002
 

What is introspection?

In everyday life, introspection is the act of self-examination: the examination of one's own thoughts, feelings, motivations, and actions. The great philosopher Socrates spent much of his life in self-examination, encouraging his fellow Athenians to do the same. He even claimed that, for him, "the unexamined life is not worth living." (See Related topics for links to more about Socrates.)

In computer programming, introspection refers to the ability to examine something to determine what it is, what it knows, and what it is capable of doing. Introspection gives programmers a great deal of flexibility and control. Once you've worked with a programming language that supports introspection, you may similarly feel that "the unexamined object is not worth instantiating."

This article introduces the introspection capabilities of the Python programming language. Python's support for introspection runs deep and wide throughout the language. In fact, it would be hard to imagine Python without its introspection features. By the end of this article you should be very comfortable poking inside the hearts and souls of your own Python objects.

We'll begin our exploration of Python introspection in the most general way possible, before diving into more advanced techniques. Some might even argue that the features we begin with don't deserve to be called "introspective." We'll have to agree that whether they fall under the umbrella of introspection or not is open to debate. For the purposes of this article, our only concern is finding the answers to interesting questions.

So let's begin our inquiry, using Python interactively. When we start Python from the command line, we enter the Python shell, where we can enter Python code and get an immediate response from the Python interpreter. (The commands listed in this article will execute properly using Python 2.2.2. You may get different results or errors if using an earlier version. You can download the latest version from the Python Web site [see Related topics].)

Listing 1. Starting the Python interpreter in interactive mode
$ python
Python 2.2.2 (#1, Oct 28 2002, 17:22:19)
[GCC 3.2 (Mandrake Linux 9.0 3.2-1mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Once you have Python running and are looking at the Python prompt (>>>), you may be wondering what words will be recognized by Python. Most programming languages have reserved words, or keywords, that have special meaning in that language, and Python is no exception. You may also have noticed that Python suggested we type help to get more information. Perhaps we can ask Python for some help about keywords.

Python's online help utility

Let's start by typing help, as suggested, and see if it gives us any clues about keywords:

Listing 2. Asking Python for help
>>> help
Type help() for interactive help, or help(object) for help about object.

Since we don't know what object might contain keywords, let's try help() without specifying any particular object:

Listing 3. Starting the help utility
>>> help()
 
Welcome to Python 2.2!  This is the online help utility.
 
If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://www.python.org/doc/tut/.
 
Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".
 
To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics".  Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".
 
help>

We seem to be getting closer now. Let's enter keywords at the help prompt:

Listing 4. Asking for help with keywords
help> keywords
 
Here is a list of the Python keywords.  Enter any keyword to get more help.
 
and                 elif                global              or
assert              else                if                  pass
break               except              import              print
class               exec                in                  raise
continue            finally             is                  return
def                 for                 lambda              try
del                 from                not                 while
 
help> quit
 
You are now leaving help and returning to the Python interpreter.
If you want to ask for help on a particular object directly from the
interpreter, you can type "help(object)".  Executing "help('string')"
has the same effect as typing a particular string at the help> prompt.
>>>

When we typed help(), we were greeted with a message and some instructions, followed by the help prompt. At the prompt, we entered keywords and were shown a list of Python keywords. Having gotten the answer to our question, we then quit the help utility, saw a brief farewell message, and were returned to the Python prompt.

As you can see from this example, Python's online help utility displays information on a variety of topics, or for a particular object. The help utility is quite useful, and does make use of Python's introspection capabilities. But simply using help doesn't reveal how help gets its information. And since the purpose of this article is to reveal all of Python's introspection secrets, we need to quickly go beyond the help utility.

Before we leave help, let's use it to get a list of available modules. Most modules are simply text files containing Python code, with names ending in .py; a few are built into the interpreter itself. If we type help('modules') at the Python prompt, or enter modules at the help prompt, we'll see a long list of available modules, similar to the partial list shown below. Try it yourself to see what modules are available on your system, and to see why Python is considered to come with "batteries included."

Listing 5. Partial listing of available modules
>>> help('modules')
 
Please wait a moment while I gather a list of all available modules...
 
BaseHTTPServer      cgitb               marshal             sndhdr
Bastion             chunk               math                socket
CDROM               cmath               md5                 sre
CGIHTTPServer       cmd                 mhlib               sre_compile
Canvas              code                mimetools           sre_constants
    <...>
bisect              macpath             signal              xreadlines
cPickle             macurl2path         site                xxsubtype
cStringIO           mailbox             slgc (package)      zipfile
calendar            mailcap             smtpd
cgi                 markupbase          smtplib
 
Enter any module name to get more help.  Or, type "modules spam" to search
for modules whose descriptions contain the word "spam".
 
>>>

The sys module

One module that provides insightful information about Python itself is the sys module. You make use of a module by importing the module and referencing its contents (such as variables, functions, and classes) using dot (.) notation. The sys module contains a variety of variables and functions that reveal interesting details about the current Python interpreter. Let's take a look at some of them. Again, we're going to run Python interactively and enter commands at the Python command prompt. The first thing we'll do is import the sys module. Then we'll enter the sys.executable variable, which contains the path to the Python interpreter:

Listing 6. Importing the sys module
$ python
Python 2.2.2 (#1, Oct 28 2002, 17:22:19)
[GCC 3.2 (Mandrake Linux 9.0 3.2-1mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.executable
'/usr/local/bin/python'

When we enter a line of code that consists of nothing more than the name of an object, Python responds by displaying a representation of the object, which, for simple objects, tends to be the value of the object. In this case, since the displayed value is enclosed in quotes, we get a clue that sys.executable is probably a string object. We'll look at other, more precise, ways to determine an object's type later, but simply typing the name of an object at the Python prompt is a quick and easy form of introspection.

Let's look at some other useful attributes of the sys module.

The platform variable tells us which operating system we are on:

Listing 7. The sys.platform attribute
>>> sys.platform
'linux2'

The current Python version is available as a string, and as a tuple (a tuple contains a sequence of objects):

Listing 8. The sys.version and sys.version_info attributes
>>> sys.version
'2.2.2 (#1, Oct 28 2002, 17:22:19) \n[GCC 3.2 (Mandrake Linux 9.0 3.2-1mdk)]'
>>> sys.version_info
(2, 2, 2, 'final', 0)

The maxint variable holds the largest value a plain integer (int) can take on this platform; larger values require Python's long integer type:

Listing 9. The sys.maxint attribute
>>> sys.maxint
2147483647

The argv variable is a list containing command line arguments, if any were specified. The first item, argv[0], is the path of the script that was run. When we run Python interactively this value is an empty string:

Listing 10. The sys.argv attribute
>>> sys.argv
['']

When we run another Python shell, such as PyCrust (see Related topics for a link to more information on PyCrust), we see something like this:

Listing 11. The sys.argv attribute using PyCrust
>>> sys.argv[0]
'/home/pobrien/Code/PyCrust/PyCrustApp.py'

The path variable is the module search path, the list of directories in which Python will look for modules during imports. The empty string, '', in the first position refers to the current directory:

Listing 12. The sys.path attribute
>>> sys.path
['', '/home/pobrien/Code',
'/usr/local/lib/python2.2',
'/usr/local/lib/python2.2/plat-linux2',
'/usr/local/lib/python2.2/lib-tk',
'/usr/local/lib/python2.2/lib-dynload',
'/usr/local/lib/python2.2/site-packages']

The modules variable is a dictionary that maps module names to module objects for all the currently loaded modules. As you can see, Python loads certain modules by default:

Listing 13. The sys.modules attribute
>>> sys.modules
{'stat': <module 'stat' from '/usr/local/lib/python2.2/stat.pyc'>,
'__future__': <module '__future__' from '/usr/local/lib/python2.2/__future__.pyc'>,
'copy_reg': <module 'copy_reg' from '/usr/local/lib/python2.2/copy_reg.pyc'>,
'posixpath': <module 'posixpath' from '/usr/local/lib/python2.2/posixpath.pyc'>,
'UserDict': <module 'UserDict' from '/usr/local/lib/python2.2/UserDict.pyc'>,
'signal': <module 'signal' (built-in)>,
'site': <module 'site' from '/usr/local/lib/python2.2/site.pyc'>,
'__builtin__': <module '__builtin__' (built-in)>,
'sys': <module 'sys' (built-in)>,
'posix': <module 'posix' (built-in)>,
'types': <module 'types' from '/usr/local/lib/python2.2/types.pyc'>,
'__main__': <module '__main__' (built-in)>,
'exceptions': <module 'exceptions' (built-in)>,
'os': <module 'os' from '/usr/local/lib/python2.2/os.pyc'>,
'os.path': <module 'posixpath' from '/usr/local/lib/python2.2/posixpath.pyc'>}
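
On a current interpreter the same attributes can be gathered in one place. The following is a minimal sketch that assumes Python 3 rather than the 2.2.2 interpreter used in the listings above; there, sys.maxint no longer exists and sys.maxsize is the closest counterpart. The helper name interpreter_summary is ours, chosen for illustration, and is not part of the sys module.

```python
import sys


def interpreter_summary():
    """Collect a few of the sys attributes discussed above into a dict."""
    return {
        "executable": sys.executable,            # path to the interpreter binary
        "platform": sys.platform,                # e.g. 'linux', 'darwin', 'win32'
        "version_info": tuple(sys.version_info[:3]),
        "maxsize": sys.maxsize,                  # Python 3 counterpart of sys.maxint
        "argv0": sys.argv[0] if sys.argv else "",
        "loaded_modules": len(sys.modules),      # modules imported so far
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(f"{key:>15}: {value}")
```

The values printed will differ from machine to machine, which is the point: each one is read from the running interpreter rather than written into the program.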

The keyword module

Let's return to our question about Python keywords. Even though help showed us a list of keywords, it turns out that some of help's information is hardcoded. The list of keywords happens to be hardcoded, which isn't very introspective after all. Let's see if we can get this information directly from one of the modules in Python's standard library. If we type help('modules keywords') at the Python prompt we see the following:

Listing 14. Asking for help on modules with keywords
>>> help('modules keywords')
 
Here is a list of matching modules.  Enter any module name to get more help.
 
keyword - Keywords (from "graminit.c")

So it appears as though the keyword module might contain keywords. By opening the keyword.py file in a text editor we can see that Python does make its list of keywords explicitly available as the kwlist attribute of the keyword module. We also see in the keyword module comments that this module is automatically generated based on the source code of Python itself, guaranteeing that its list of keywords is accurate and complete:

Listing 15. The keyword module's keyword list
>>> import keyword
>>> keyword.kwlist
['and', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else',
'except', 'exec', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is',
'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try', 'while', 'yield']
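
The keyword module also offers iskeyword(), which we will see in the dir() output shortly. As a small sketch of how the two might be used together, assuming a Python 3 interpreter (where the list differs from the one above; print and exec, for instance, are no longer keywords), the hypothetical helper is_safe_identifier below checks whether a string could be used as a variable name:

```python
import keyword

# kwlist is the list printed above; iskeyword() tests one name at a time.
print(len(keyword.kwlist), "keywords in this interpreter")


def is_safe_identifier(name):
    """Return True if name is a valid identifier that is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_safe_identifier("lambda"))    # a reserved word
print(is_safe_identifier("lambda_"))   # a trailing underscore avoids the clash
```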

The dir() function

While it's relatively easy to find and import a module, it isn't as easy to remember what each module contains. And you don't always want to have to look at the source code to find out. Fortunately, Python provides a way to examine the contents of modules (and other objects) using the built-in dir() function.

The dir() function is probably the best known of all of Python's introspection mechanisms. It returns a sorted list of attribute names for any object passed to it. If no object is specified, dir() returns the names in the current scope. Let's apply dir() to our keyword module and see what it reveals:

Listing 16. The keyword module's attributes
>>> dir(keyword)
['__all__', '__builtins__', '__doc__', '__file__', '__name__',
'iskeyword', 'keyword', 'kwdict', 'kwlist', 'main']

And how about the sys module we looked at earlier?

Listing 17. The sys module's attributes
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', '__stderr__',
'__stdin__', '__stdout__', '_getframe', 'argv', 'builtin_module_names',
'byteorder', 'copyright', 'displayhook', 'exc_info', 'exc_type', 'excepthook',
'exec_prefix', 'executable', 'exit', 'getdefaultencoding', 'getdlopenflags',
'getrecursionlimit', 'getrefcount', 'hexversion', 'last_traceback',
'last_type', 'last_value', 'maxint', 'maxunicode', 'modules', 'path',
'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval', 'setdlopenflags',
'setprofile', 'setrecursionlimit', 'settrace', 'stderr', 'stdin', 'stdout',
'version', 'version_info', 'warnoptions']

Without any argument, dir() returns names in the current scope. Notice how keyword and sys appear in the list, since we imported them earlier. Importing a module adds the module's name to the current scope:

Listing 18. Names in the current scope
>>> dir()
['__builtins__', '__doc__', '__name__', 'keyword', 'sys']

We mentioned that the dir() function was a built-in function, which means that we don't have to import a module in order to use the function. Python recognizes built-in functions without our having to do anything. And now we see this name, __builtins__, returned by a call to dir(). Perhaps there is a connection here. Let's enter the name __builtins__ at the Python prompt and see if Python tells us anything interesting about it:

Listing 19. What is __builtins__?
>>> __builtins__
<module '__builtin__' (built-in)>

So __builtins__ appears to be a name in the current scope that's bound to the module object named __builtin__. (Since modules are not simple objects with single values, Python displays information about the module inside angle brackets instead.) Note that if you look for a __builtin__.py file on disk you'll come up empty-handed. This particular module object is created out of thin air by the Python interpreter, because it contains items that are always available to the interpreter. And while there is no physical file to look at, we can still apply our dir() function to this object to see all the built-in functions, error objects, and a few miscellaneous attributes that it contains:

Listing 20. The __builtins__ module's attributes
>>> dir(__builtins__)
['ArithmeticError', 'AssertionError', 'AttributeError', 'DeprecationWarning',
'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False',
'FloatingPointError', 'IOError', 'ImportError', 'IndentationError',
'IndexError', 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError',
'NameError', 'None', 'NotImplemented', 'NotImplementedError', 'OSError',
'OverflowError', 'OverflowWarning', 'ReferenceError', 'RuntimeError',
'RuntimeWarning', 'StandardError', 'StopIteration', 'SyntaxError',
'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'True', 'TypeError',
'UnboundLocalError', 'UnicodeError', 'UserWarning', 'ValueError', 'Warning',
'ZeroDivisionError', '_', '__debug__', '__doc__', '__import__', '__name__',
'abs', 'apply', 'bool', 'buffer', 'callable', 'chr', 'classmethod', 'cmp',
'coerce', 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict',
'dir', 'divmod', 'eval', 'execfile', 'exit', 'file', 'filter', 'float',
'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int',
'intern', 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list',
'locals', 'long', 'map', 'max', 'min', 'object', 'oct', 'open', 'ord', 'pow',
'property', 'quit', 'range', 'raw_input', 'reduce', 'reload', 'repr', 'round',
'setattr', 'slice', 'staticmethod', 'str', 'super', 'tuple', 'type', 'unichr',
'unicode', 'vars', 'xrange', 'zip']
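
A long flat list like this is easier to digest once it is sorted into groups. The sketch below assumes Python 3, where the module is importable as builtins rather than __builtin__; the helper names builtin_names and is_exception are ours, not part of the standard library. It uses dir() and getattr() to separate the exception classes from the other callables.

```python
import builtins


def builtin_names(kind):
    """Return sorted public names in builtins whose objects satisfy `kind`."""
    return sorted(
        name
        for name in dir(builtins)
        if not name.startswith("_") and kind(getattr(builtins, name))
    )


def is_exception(obj):
    """True for exception classes such as ValueError or KeyError."""
    return isinstance(obj, type) and issubclass(obj, BaseException)


exceptions = builtin_names(is_exception)
callables = builtin_names(callable)

print(len(exceptions), "exception classes, starting with", exceptions[:3])
print(len(callables), "callable names in total")
```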

The dir() function works on all object types, including strings, integers, lists, tuples, dictionaries, functions, custom classes, class instances, and class methods. Let's apply dir() to a string object and see what Python returns. As you can see, even a simple Python string has a number of attributes:

Listing 21. String attributes
>>> dir('this is a string')
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__',
'__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__',
'__hash__', '__init__', '__le__', '__len__', '__lt__', '__mul__', '__ne__',
'__new__', '__reduce__', '__repr__', '__rmul__', '__setattr__', '__str__',
'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs',
'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'replace', 'rfind',
'rindex', 'rjust', 'rstrip', 'split', 'splitlines', 'startswith', 'strip',
'swapcase', 'title', 'translate', 'upper', 'zfill']

Try the following examples yourself to see what they return. Note that the # character marks the start of a comment. Everything from the start of the comment to the end of the line is ignored by Python:

Listing 22. Using dir() on other objects
dir(42)   # Integer (and the meaning of life)
dir([])   # List (an empty list, actually)
dir(())   # Tuple (also empty)
dir({})   # Dictionary (ditto)
dir(dir)  # Function (functions are also objects)

To illustrate the dynamic nature of Python's introspection capabilities, let's look at some examples using dir() on a custom class and some class instances. We're going to define our own class interactively, create some instances of the class, add a unique attribute to only one of the instances, and see if Python can keep all of this straight. Here are the results:

Listing 23. Using dir() on custom classes, class instances, and attributes
>>> class Person(object):
...     """Person class."""
...     def __init__(self, name, age):
...         self.name = name
...         self.age = age
...     def intro(self):
...         """Return an introduction."""
...         return "Hello, my name is %s and I'm %s." % (self.name, self.age)
...
>>> bob = Person("Robert", 35)   # Create a Person instance
>>> joe = Person("Joseph", 17)   # Create another
>>> joe.sport = "football"       # Assign a new attribute to one instance
>>> dir(Person)      # Attributes of the Person class
['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__',
'__hash__', '__init__', '__module__', '__new__', '__reduce__', '__repr__',
'__setattr__', '__str__', '__weakref__', 'intro']
>>> dir(bob)         # Attributes of bob
['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__',
'__hash__', '__init__', '__module__', '__new__', '__reduce__', '__repr__',
'__setattr__', '__str__', '__weakref__', 'age', 'intro', 'name']
>>> dir(joe)         # Note that joe has an additional attribute
['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__',
'__hash__', '__init__', '__module__', '__new__', '__reduce__', '__repr__',
'__setattr__', '__str__', '__weakref__', 'age', 'intro', 'name', 'sport']
>>> bob.intro()      # Calling bob's intro method
"Hello, my name is Robert and I'm 35."
>>> dir(bob.intro)   # Attributes of the intro method
['__call__', '__class__', '__cmp__', '__delattr__', '__doc__', '__get__',
'__getattribute__', '__hash__', '__init__', '__new__', '__reduce__',
'__repr__', '__setattr__', '__str__', 'im_class', 'im_func', 'im_self']
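
The interesting differences in the lists above are buried among the double-underscore names. One way to surface them, sketched here for Python 3 with a hypothetical helper called public_names, is to filter the dir() output and to compare the results as sets. The Person class is repeated so that the example runs on its own.

```python
class Person:
    """Person class."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def intro(self):
        """Return an introduction."""
        return "Hello, my name is %s and I'm %s." % (self.name, self.age)


def public_names(obj):
    """Return the names reported by dir() that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


bob = Person("Robert", 35)
joe = Person("Joseph", 17)
joe.sport = "football"   # attribute added to one instance only

print(public_names(Person))                     # ['intro']
print(public_names(bob))                        # ['age', 'intro', 'name']
print(sorted(set(dir(joe)) - set(dir(bob))))    # ['sport']
print(vars(joe))                                # joe's own attribute dictionary
```

The built-in vars() function returns only what is stored on the instance itself, so it is a quick way to tell per-instance data apart from what the class provides.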

Documentation strings

One attribute you may have noticed in a lot of our dir() examples is the __doc__ attribute. This attribute is a string containing the comments that describe an object. Python calls this a documentation string, or docstring, and here is how it works. If the first statement of a module, class, method, or function definition is a string, then that string gets associated with the object as its __doc__ attribute. For example, take a look at the docstring for the __builtins__ object. We'll use Python's print statement to make the output easier to read, since docstrings often contain embedded newlines (\n):

Listing 24. Module docstring
>>> print __builtins__.__doc__   # Module docstring
Built-in functions, exceptions, and other objects.
 
Noteworthy: None is the `nil' object; Ellipsis represents `...' in slices.

Once again, Python even maintains docstrings on classes and methods that are defined interactively in the Python shell. Let's look at the docstrings for our Person class and its intro method:

Listing 25. Class and method docstrings
>>> Person.__doc__         # Class docstring
'Person class.'
>>> Person.intro.__doc__   # Method docstring
'Return an introduction.'

Because docstrings provide such valuable information, many Python development environments have ways of automatically displaying the docstrings for objects. Let's look at one more docstring, for the dir() function:

Listing 26. Function docstring
>>> print dir.__doc__   # Function docstring
dir([object]) -> list of strings
 
Return an alphabetized list of names comprising (some of) the attributes
of the given object, and of attributes reachable from it:
 
No argument:  the names in the current scope.
Module object:  the module attributes.
Type or class object:  its attributes, and recursively the attributes of
    its bases.
Otherwise:  its attributes, its class's attributes, and recursively the
    attributes of its class's base classes.
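
Multi-line docstrings written inside an indented block may carry that indentation with them. As a brief sketch, assuming Python 3 and using a made-up function named area, the standard inspect.getdoc() function returns the docstring with the indentation cleaned up, which is usually what a tool wants to display:

```python
import inspect


def area(width, height):
    """Return the area of a rectangle.

    Both arguments are expected to be numbers in the same unit.
    """
    return width * height


print(repr(area.__doc__))     # the docstring as stored on the function
print(inspect.getdoc(area))   # the same text, tidied for display
```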

Interrogating Python objects

We've mentioned the word "object" several times, but haven't really defined it. An object in a programming environment is much like an object in the real world. A real object has a certain shape, size, weight, and other characteristics. And a real object is able to respond to its environment, interact with other objects, or perform a task. Computer objects attempt to model the objects that surround us in the real world, including abstract objects like documents and schedules and business processes.

Like real-world objects, several computer objects may share common characteristics while maintaining their own minor variations. Think of the books you see in a bookstore. Each physical copy of a book might have a smudge, or a few torn pages, or a unique identification number. And while each book is a unique object, every book with the same title is merely an instance of an original template, and retains most of the characteristics of the original.

The same is true about object-oriented classes and class instances. For example, every Python string is endowed with the attributes we saw revealed by the dir() function. And in a previous example, we defined our own Person class, which acted as a template for creating individual Person instances, each having its own name and age values, while sharing the ability to introduce itself. That's object-orientation.

In computer terms, then, objects are things that have an identity and a value, are of a certain type, possess certain characteristics, and behave in a certain way. And objects inherit many of their attributes from one or more parent classes. Other than keywords and special symbols (like operators, such as +, -, *, **, /, %, <, >, etc.) everything in Python is an object. And Python comes with a rich set of object types: strings, integers, floats, lists, tuples, dictionaries, functions, classes, class instances, modules, files, etc.

When you have an arbitrary object, perhaps one that was passed as an argument to a function, you may want to know a few things about that object. In this section we're going to show you how to get Python objects to answer questions such as:

  • What is your name?
  • What kind of object are you?
  • What do you know?
  • What can you do?
  • Who are your parents?

Name

Not all objects have names, but for those that do, the name is stored in their __name__ attribute. Note that the name is derived from the object, not the variable that references the object. The following example highlights that distinction:

Listing 27. What's in a name?
$ python
Python 2.2.2 (#1, Oct 28 2002, 17:22:19)
[GCC 3.2 (Mandrake Linux 9.0 3.2-1mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> dir()                # The dir() function
['__builtins__', '__doc__', '__name__']
>>> directory = dir      # Create a new variable
>>> directory()          # Works just like the original object
['__builtins__', '__doc__', '__name__', 'directory']
>>> dir.__name__         # What's your name?
'dir'
>>> directory.__name__   # My name is the same
'dir'
>>> __name__             # And now for something completely different
'__main__'
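
Because not every object has a name, code that asks for one should be prepared for the answer to be missing. The sketch below, written for Python 3 with a hypothetical helper called describe, uses getattr() with a default so that nameless objects yield None instead of raising an error:

```python
import math


def describe(obj):
    """Return the object's __name__, or None if it has none."""
    return getattr(obj, "__name__", None)


directory = dir               # a second variable bound to the same function
print(describe(directory))    # 'dir': the name comes from the object, not the variable
print(describe(math))         # 'math': modules have names
print(describe(int))          # 'int': so do classes and types
print(describe(42))           # None: an integer instance has no __name__
```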

Modules have names, and the Python interpreter itself is considered the top-level, or main, module. When you run Python interactively the local __name__ variable is assigned a value of '__main__'. Likewise, when you execute a Python module from the command line, rather than importing it into another module, its __name__ attribute is assigned a value of '__main__', rather than the actual name of the module. In this way, modules can look at their own __name__ value to determine for themselves how they are being used, whether as support for another program or as the main application executed from the command line. Thus, the following idiom is quite common in Python modules:

Listing 28. Testing for execution or import
if __name__ == '__main__':
    # Do something appropriate here, like calling a
    # main() function defined elsewhere in this module.
    main()
else:
    # Do nothing. This module has been imported by another
    # module that wants to make use of the functions,
    # classes and other useful bits it has defined.
    pass

Type

The type() function helps us determine whether an object is a string or an integer or some other kind of object. It does this by returning a type object, which can be compared to the types defined in the types module:

Listing 29. Am I your type?
>>> import types
>>> print types.__doc__
Define names for all type symbols known in the standard interpreter.
 
Types that are part of optional modules (e.g. array) are not listed.
 
>>> dir(types)
['BufferType', 'BuiltinFunctionType', 'BuiltinMethodType', 'ClassType',
'CodeType', 'ComplexType', 'DictProxyType', 'DictType', 'DictionaryType',
'EllipsisType', 'FileType', 'FloatType', 'FrameType', 'FunctionType',
'GeneratorType', 'InstanceType', 'IntType', 'LambdaType', 'ListType',
'LongType', 'MethodType', 'ModuleType', 'NoneType', 'ObjectType', 'SliceType',
'StringType', 'StringTypes', 'TracebackType', 'TupleType', 'TypeType',
'UnboundMethodType', 'UnicodeType', 'XRangeType', '__builtins__', '__doc__',
'__file__', '__name__']
>>> s = 'a sample string'
>>> type(s)
<type 'str'>
>>> if type(s) is types.StringType: print "s is a string"
...
s is a string
>>> type(42)
<type 'int'>
>>> type([])
<type 'list'>
>>> type({})
<type 'dict'>
>>> type(dir)
<type 'builtin_function_or_method'>
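
In Python 3 the types module no longer defines names such as StringType, because the built-in names str, int, list, and so on are themselves the type objects. The following sketch assumes Python 3; kind_of is a hypothetical helper that compares the result of type() against those built-in types and against two names that the types module still provides.

```python
import types


def kind_of(obj):
    """Return a short label based on the object's exact type."""
    kind = type(obj)
    if kind is str:
        return "string"
    if kind is int:
        return "integer"
    if kind in (list, tuple, dict):
        return "container"
    if kind in (types.FunctionType, types.BuiltinFunctionType):
        return "function"
    return kind.__name__   # fall back to the type's own name


for sample in ("a sample string", 42, [], {}, dir, kind_of, 3.14):
    print(repr(sample)[:30], "->", kind_of(sample))
```

Note that this compares exact types, so an instance of a subclass falls through to the last line. The isinstance() function, covered below, is the usual choice when subclasses should match too.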

Identity

We said earlier that every object has an identity, a type, and a value. What's important to note is that more than one variable may refer to the exact same object, and, likewise, variables may refer to objects that look alike (having the same type and value), but have separate and distinct identities. This notion of object identity is particularly important when making changes to objects, such as appending an item to a list, as in the example below where the blist and clist variables both reference the same list object. As you can see in the example, the id() function returns the unique identifier for any given object:

Listing 30. The Bourne ...
>>> print id.__doc__
id(object) -> integer
 
Return the identity of an object.  This is guaranteed to be unique among
simultaneously existing objects.  (Hint: it's the object's memory address.)
>>> alist = [1, 2, 3]
>>> blist = [1, 2, 3]
>>> clist = blist
>>> clist
[1, 2, 3]
>>> blist
[1, 2, 3]
>>> alist
[1, 2, 3]
>>> id(alist)
145381412
>>> id(blist)
140406428
>>> id(clist)
140406428
>>> alist is blist    # Returns 1 if True, 0 if False
0
>>> blist is clist    # Ditto
1
>>> clist.append(4)   # Add an item to the end of the list
>>> clist
[1, 2, 3, 4]
>>> blist             # Same, because they both point to the same object
[1, 2, 3, 4]
>>> alist             # This one only looked the same initially
[1, 2, 3]
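
The same experiment can be condensed into a script that contrasts equality (==) with identity (is). This is a sketch for Python 3, where the comparisons print True and False rather than 1 and 0; the extra name dlist is ours and shows that copying a list produces an equal value with a new identity.

```python
alist = [1, 2, 3]
blist = [1, 2, 3]
clist = blist            # second name for the same list object
dlist = list(blist)      # shallow copy: equal value, new identity

print(alist == blist, alist is blist)   # True False
print(blist is clist)                   # True
print(dlist == blist, dlist is blist)   # True False

clist.append(4)          # mutate through one name...
print(blist)             # [1, 2, 3, 4]: visible through the other name
print(dlist)             # [1, 2, 3]: the copy is unaffected
print(id(blist) == id(clist))           # True
```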

Attributes

We've seen that objects have attributes, and that the dir() function will return a list of these attributes. Sometimes, however, we simply want to test for the existence of one or more attributes. And if an object has the attribute in question, we often want to retrieve that attribute. These tasks are handled by the hasattr() and getattr() functions, as illustrated in this example:

Listing 31. Have an attribute; get an attribute
>>> print hasattr.__doc__
hasattr(object, name) -> Boolean
 
Return whether the object has an attribute with the given name.
(This is done by calling getattr(object, name) and catching exceptions.)
>>> print getattr.__doc__
getattr(object, name[, default]) -> value
 
Get a named attribute from an object; getattr(x, 'y') is equivalent to x.y.
When a default argument is given, it is returned when the attribute doesn't
exist; without it, an exception is raised in that case.
>>> hasattr(id, '__doc__')
1
>>> print getattr(id, '__doc__')
id(object) -> integer
 
Return the identity of an object.  This is guaranteed to be unique among
simultaneously existing objects.  (Hint: it's the object's memory address.)
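
The optional third argument to getattr() is worth a closer look, since it lets us ask for an attribute and supply a fallback in a single step. The sketch below assumes Python 3; the Config class and the lookup helper are hypothetical and exist only for this illustration.

```python
class Config:
    """Hypothetical settings object used only for this illustration."""

    host = "localhost"


def lookup(obj, name, default=None):
    """Return the named attribute if present, otherwise the supplied default."""
    return getattr(obj, name, default)


settings = Config()
print(hasattr(settings, "host"))        # True: defined on the class
print(hasattr(settings, "port"))        # False: never defined
print(lookup(settings, "host"))         # localhost
print(lookup(settings, "port", 8080))   # 8080: falls back instead of raising

try:
    getattr(settings, "port")           # no default, so AttributeError is raised
except AttributeError as exc:
    print("AttributeError:", exc)
```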

Callables

Objects that represent potential behavior (functions and methods) can be invoked, or called. We can test an object's callability with the callable() function:

Listing 32. Can you do something for me?
>>> print callable.__doc__
callable(object) -> Boolean
 
Return whether the object is callable (i.e., some kind of function).
Note that classes are callable, as are instances with a __call__() method.
>>> callable('a string')
0
>>> callable(dir)
1
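The docstring notes that classes are callable, as are instances with a __call__() method. A small sketch of those cases, in Python 3 syntax (where callable() returns True or False rather than 1 or 0); Greeter and Plain are hypothetical classes invented for the example.

```python
class Greeter:
    """Hypothetical class whose instances can be called like functions."""

    def __call__(self, name):
        return "Hello, %s" % name


class Plain:
    """Hypothetical class that does not define __call__."""


greet = Greeter()

results = {
    "class": callable(Greeter),            # calling a class creates an instance
    "callable instance": callable(greet),  # Greeter defines __call__
    "plain instance": callable(Plain()),   # no __call__, so not callable
    "bound method": callable("abc".upper),
}

for label, answer in results.items():
    print(label, answer)
print(greet("world"))
```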

Instances

While the type() function gave us the type of an object, we can also test an object to determine if it is an instance of a particular type, or custom class, using the isinstance() function:

Listing 33. Are you one of those?
>>> print isinstance.__doc__
isinstance(object, class-or-type-or-tuple) -> Boolean
 
Return whether an object is an instance of a class or of a subclass thereof.
With a type as second argument, return whether that is the object's type.
The form using a tuple, isinstance(x, (A, B, ...)), is a shortcut for
isinstance(x, A) or isinstance(x, B) or ... (etc.).
>>> isinstance(42, str)
0
>>> isinstance('a string', int)
0
>>> isinstance(42, int)
1
>>> isinstance('a string', str)
1
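The tuple form described in the docstring is handy when several types should be treated alike. Below is a short sketch in Python 3 syntax; describe() is a hypothetical helper, not something from the article.

```python
def describe(value):
    """Classify a value using isinstance() checks (illustrative helper)."""
    # bool is a subclass of int, so it has to be tested first.
    if isinstance(value, bool):
        return "boolean"
    # The tuple form is a shortcut for isinstance(value, int) or isinstance(value, float).
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "text"
    return "other"


for sample in (True, 42, 1.5, "a string", []):
    print(repr(sample), describe(sample))
```

Because isinstance() also accepts subclasses, the order of the checks matters whenever one tested type inherits from another.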

Subclasses

We mentioned earlier that instances of a custom class inherit their attributes from the class. At the class level, a class may be defined in terms of another class, and will likewise inherit attributes in a hierarchical fashion. Python even supports multiple inheritance, meaning an individual class can be defined in terms of, and inherit from, more than one parent class. The issubclass() function allows us to find out if one class inherits from another:

Listing 34. Are you my mother?
>>> print issubclass.__doc__
issubclass(C, B) -> Boolean
 
Return whether class C is a subclass (i.e., a derived class) of class B.
>>> class SuperHero(Person):   # SuperHero inherits from Person...
...     def intro(self):       # but with a new SuperHero intro
...         """Return an introduction."""
...         return "Hello, I'm SuperHero %s and I'm %s." % (self.name, self.age)
...
>>> issubclass(SuperHero, Person)
1
>>> issubclass(Person, SuperHero)
0
>>>
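The paragraph above also mentions multiple inheritance. As a minimal sketch in Python 3 syntax, the classes below (Person, Flyer, and SuperHero are stand-ins defined here, not the article's earlier definitions) show how issubclass() behaves with more than one parent, and how __bases__ and __mro__ expose the hierarchy.

```python
class Person:
    """Stand-in parent class for this illustration."""


class Flyer:
    """Second, hypothetical parent class."""


class SuperHero(Person, Flyer):
    """Inherits from both parents."""


checks = [
    issubclass(SuperHero, Person),        # first parent
    issubclass(SuperHero, Flyer),         # second parent
    issubclass(SuperHero, SuperHero),     # a class counts as its own subclass
    issubclass(Person, SuperHero),        # inheritance does not run backwards
    issubclass(SuperHero, (int, Flyer)),  # tuple form: true if any entry matches
]
print(checks)

# The direct parents and the full method resolution order are also inspectable.
print(SuperHero.__bases__)
print([cls.__name__ for cls in SuperHero.__mro__])
```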

Interrogation time

Let's wrap things up by putting together several of the introspection techniques we've covered in the last section. To do so, we're going to define our own function, interrogate(), which prints a variety of information about any object passed to it. Here is the code, followed by several examples of its use:

Listing 35. Nobody expects it
>>> def interrogate(item):
...     """Print useful information about item."""
...     if hasattr(item, '__name__'):
...         print "NAME:    ", item.__name__
...     if hasattr(item, '__class__'):
...         print "CLASS:   ", item.__class__.__name__
...     print "ID:      ", id(item)
...     print "TYPE:    ", type(item)
...     print "VALUE:   ", repr(item)
...     print "CALLABLE:",
...     if callable(item):
...         print "Yes"
...     else:
...         print "No"
...     if hasattr(item, '__doc__'):
...         doc = getattr(item, '__doc__')
...         if doc:                     # __doc__ may be None
...             doc = doc.strip()       # Remove leading/trailing whitespace.
...             firstline = doc.split('\n')[0]
...             print "DOC:     ", firstline
...
>>> interrogate('a string')     # String object
CLASS:    str
ID:       141462040
TYPE:     <type 'str'>
VALUE:    'a string'
CALLABLE: No
DOC:      str(object) -> string
>>> interrogate(42)             # Integer object
CLASS:    int
ID:       135447416
TYPE:     <type 'int'>
VALUE:    42
CALLABLE: No
DOC:      int(x[, base]) -> integer
>>> interrogate(interrogate)    # User-defined function object
NAME:     interrogate
CLASS:    function
ID:       141444892
TYPE:     <type 'function'>
VALUE:    <function interrogate at 0x86e471c>
CALLABLE: Yes
DOC:      Print useful information about item.

As you can see in the last example, our interrogate() function even works on itself. You can't get much more introspective than that.
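The listing above uses the Python 2 print statement. For readers on Python 3, here is a rough equivalent, offered as a sketch rather than the author's code: it collects the same facts into a dictionary instead of printing them directly, and it skips the DOC entry when an object has no docstring.

```python
def interrogate(item):
    """Return a dict of facts about item."""
    info = {}
    if hasattr(item, "__name__"):
        info["NAME"] = item.__name__
    info["CLASS"] = item.__class__.__name__
    info["ID"] = id(item)
    info["TYPE"] = type(item)
    info["VALUE"] = repr(item)
    info["CALLABLE"] = "Yes" if callable(item) else "No"
    doc = getattr(item, "__doc__", None)   # may be missing or None
    if doc:
        info["DOC"] = doc.strip().split("\n")[0]
    return info


for key, value in interrogate(interrogate).items():
    print("%-9s %s" % (key + ":", value))
```

Returning a dictionary keeps the gathering of facts separate from their display, so the same function can feed a log, a test, or a prompt.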

Conclusion

Who knew that introspection could be so simple, and so rewarding? And yet, I must end here with a caution: do not mistake the results of introspection for wisdom. The experienced Python programmer knows that there is always more they do not know, and is therefore not wise at all. The act of programming produces more questions than answers. The only thing good about Python, as we have seen here today, is that it does answer one's questions. As for me, do not feel a need to compensate me for helping you understand these things that Python has to offer. Programming in Python is its own reward. All I ask from my fellow Pythonians is free meals at the public expense.

 


Introspection in Python http://zetcode.com/lang/python/introspection/

Introspection in Python

In this part of the Python tutorial, we talk about introspection.

 

Introspection is the act of self-examination. In computer programming, introspection is the ability to determine the type or properties of an object at runtime. The Python language has extensive support for introspection. Everything in Python is an object, and every object may have attributes and methods. Using introspection, we can inspect Python objects dynamically.

Python dir function

The dir() function returns a sorted list of attributes and methods belonging to an object.

>>> dir(())
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__',
'__getslice__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
'__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', 'count', 'index']

Here we see an output of the dir() function for a tuple object.

>>> print(().__doc__)
tuple() -> empty tuple
tuple(iterable) -> tuple initialized from iterable's items

If the argument is a tuple, the return value is the same object.

Our investigation showed that there is a __doc__ attribute for a tuple object.

direx.py
#!/usr/bin/python3

# direx.py

import sys

class MyObject(object):

    def __init__(self):
        pass

    def examine(self):
        print(self)


o = MyObject()

print(dir(o))
print(dir([]))
print(dir({}))
print(dir(1))
print(dir())
print(dir(len))
print(dir(sys))
print(dir("String"))

The example examines several objects using the dir() function: a user-defined object, native data types, a function, a module, a string, and a number.

Without any argument, dir() returns names in the current scope.

>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__']
>>> import sys
>>> import math, os
>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'math', 'os', 'sys']

We call the dir() function before and after importing some modules.

Python type function

The type() function returns the type of an object.

typefun.py
#!/usr/bin/python3

# typefun.py

import sys

def function():
    pass


class MyObject(object):

    def __init__(self):
        pass


o = MyObject()

print(type(1))
print(type(""))
print(type([]))
print(type({}))
print(type(()))
print(type(object))
print(type(function))
print(type(MyObject))
print(type(o))
print(type(sys))

The example prints the types of various objects to the console.

$ ./typefun.py
<class 'int'>
<class 'str'>
<class 'list'>
<class 'dict'>
<class 'tuple'>
<class 'type'>
<class 'function'>
<class 'type'>
<class '__main__.MyObject'>
<class 'module'>

This is the output.

The id() function

The id() function returns the identity of an object: an integer that is unique among simultaneously existing objects.

idfun.py
#!/usr/bin/python3

# idfun.py

import sys

def fun(): pass

class MyObject(object):

    def __init__(self):
        pass


o = MyObject()

print(id(1))
print(id(""))
print(id({}))
print(id([]))
print(id(sys))
print(id(fun))
print(id(MyObject))
print(id(o))
print(id(object))

The code example prints ids of various objects, both built-in and custom.

$ ./idfun.py
10914368
139696088742576
139696087935944
139696065155784
139696088325640
139696088244296
21503992
139696087910776
10738720

Python sys module

The sys module provides access to system-specific variables and functions used or maintained by the interpreter, and to functions that interact strongly with the interpreter. The module allows us to query the Python environment.

>>> import sys
>>> sys.version
'3.5.2 (default, Nov 17 2016, 17:05:23) \n[GCC 5.4.0 20160609]'
>>> sys.platform
'linux'
>>> sys.path
['', '/usr/lib/python35.zip', '/usr/lib/python3.5', '/usr/lib/python3.5/plat-x86_64-linux-gnu',
'/usr/lib/python3.5/lib-dynload', '/home/janbodnar/.local/lib/python3.5/site-packages',
'/usr/local/lib/python3.5/dist-packages', '/usr/lib/python3/dist-packages']

In the above code we examine the Python version, platform, and search path locations.

We can also use the dir() function to get a full list of variables and functions of the sys module.

>>> sys.executable
'/usr/bin/python3'
>>> sys.argv
['']
>>> sys.byteorder
'little'

The example presents the executable, argv, and byteorder attributes of the sys module.

>>> sys.executable
'/usr/bin/python3'

The executable is a string giving the name of the executable binary for the Python interpreter, on systems where this makes sense.

>>> sys.argv
['']

This gives a list of command line arguments passed to a Python script.

>>> sys.byteorder
'little'

The byteorder is an indicator of the native byte order. This will have the value 'big' on big-endian (most-significant byte first) platforms, and 'little' on little-endian (least-significant byte first) platforms.

Other introspection

Next we show various other ways of inspecting Python objects.

attr.py
#!/usr/bin/python3

# attr.py

def fun():
    pass


print(hasattr(object, '__doc__'))
print(hasattr(fun, '__doc__'))
print(hasattr(fun, '__call__'))

print(getattr(object, '__doc__'))
print(getattr(fun, '__doc__'))

The hasattr() function checks whether an object has a given attribute. The getattr() function returns the value of the attribute, if it exists.

$ ./attr.py
True
True
True
The most base type
None

The isinstance() function checks whether an object is an instance of a specific class.

>>> print(isinstance.__doc__)
Return whether an object is an instance of a class or of a subclass thereof.

A tuple, as in ``isinstance(x, (A, B, ...))``, may be given as the target to
check against. This is equivalent to ``isinstance(x, A) or isinstance(x, B)
or ...`` etc.

We can get the description of a function interactively.

instance.py
#!/usr/bin/python3

# instance.py

class MyObject(object):

    def __init__(self):
        pass


o = MyObject()

print(isinstance(o, MyObject))
print(isinstance(o, object))
print(isinstance(2, int))
print(isinstance('str', str))

As we know, everything in Python is an object, even numbers and strings. The object type is the base type of all objects in Python.

$ ./instance.py
True
True
True
True

The issubclass() function checks if a specific class is a derived class of another class.

subclass.py
#!/usr/bin/python3

# subclass.py

class Object(object):

    def __init__(self):
        pass


class Wall(Object):

    def __init__(self):
        pass


print(issubclass(Object, Object))
print(issubclass(Object, Wall))
print(issubclass(Wall, Object))
print(issubclass(Wall, Wall))

In our code example, the Wall class is a subclass of the Object class. Object and Wall are also subclasses of themselves. The Object class is not a subclass of class Wall.

$ ./subclass.py
True
False
True
True

The __doc__ attribute gives some documentation about an object and the __name__ attribute holds the name of the object.

namedoc.py
#!/usr/bin/python3

# namedoc.py

def noaction():
    '''A function, which does nothing'''
    pass


funcs = [noaction, len, str]

for i in funcs:

    print(i.__name__)
    print(i.__doc__)
    print("-" * 75)

In our example, we create a list of three functions: one custom and two native. We go through the list and print the __name__ and the __doc__ attributes.

$ ./namedoc.py
noaction
A function, which does nothing
---------------------------------------------------------------------------
len
Return the number of items in a container.
---------------------------------------------------------------------------
str
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.
---------------------------------------------------------------------------

This is the output.

Finally, there is also the callable() function. It checks whether an object is callable, such as a function or a method.

callable.py
#!/usr/bin/python3

# callable.py

class Car(object):

    def setName(self, name):
        self.name = name


def fun():
    pass


c = Car()

print(callable(fun))
print(callable(c.setName))
print(callable([]))
print(callable(1))

In the code example we check whether four objects are callable.

print(callable(fun))
print(callable(c.setName))

The fun() function and the setName() method are callables. (A method is a function bound to an object.)

$ ./callable.py
True
True
False
False

In this part of the Python tutorial, we have talked about introspection in Python. More tools for doing introspection can be found in the inspect module.
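As a brief taste of the inspect module mentioned above, the sketch below uses a few of its standard functions (inspect.signature, inspect.getmembers, inspect.getdoc, inspect.isclass). The greet() function and Car class are hypothetical examples, not part of the tutorial's scripts.

```python
import inspect


def greet(name, greeting="Hello"):
    """Return a greeting for name."""
    return "%s, %s!" % (greeting, name)


class Car:
    def set_name(self, name):
        self.name = name


# Inspect a function's parameters and their defaults.
sig = inspect.signature(greet)
param_names = list(sig.parameters)
default = sig.parameters["greeting"].default

# List the functions defined on a class.
methods = [name for name, _ in inspect.getmembers(Car, inspect.isfunction)]

# Fetch a cleaned-up docstring and test what kind of object we hold.
doc = inspect.getdoc(greet)
is_class = inspect.isclass(Car)

print(param_names, default, methods, doc, is_class)
```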

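Python's introspection mechanism - 青山牧云人 - 博客园 https://www.cnblogs.com/ArsenalfanInECNU/p/9110262.html

Python's introspection mechanism

What is introspection?

In everyday life, introspection is the act of self-examination.

In computer programming, introspection is the ability to examine something to determine what it is, what it knows, and what it is capable of doing. Introspection gives programmers a great deal of flexibility and control.

Put more plainly: introspection means that a program written in an object-oriented language can find out the type of an object at runtime.

Languages such as Python, Ruby, Objective-C, and C++ all support introspection to some degree. Of these, C++ has the weakest support: it can only tell you what type an object is, whereas Python can tell you both the type and the attributes the object has.

The best way to understand introspection is through examples. "Type introspection" is a collection of introspection examples in various programming languages. These examples matter: a description alone may not make introspection clear, but the examples make it click right away.

Back to Python. The most commonly used introspection functions in Python are dir(), type(), hasattr(), and isinstance(). With them we can, while the program is running, find out an object's type, check whether the object has a given attribute, and access the object's attributes.

dir()

The dir() function is probably the best-known part of Python's introspection mechanism. It returns a sorted list of the attribute names of any object passed to it. If no object is specified, dir() returns the names in the current scope. Let's apply dir() to the keyword module and see what it reveals: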

>>> import keyword
>>> dir(keyword)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'iskeyword', 'kwlist', 'main']

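type()
The type() function helps us determine whether an object is a string, an integer, or some other kind of object. It does this by returning a type object, which can be compared with the types defined in the types module: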

>>> type(42)
<class 'int'>
>>> type([])
<class 'list'>


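hasattr()

Objects have attributes, and the dir() function returns a list of them. Sometimes, however, we only want to test whether one or more attributes exist, and if the object has the attribute in question, we often want to retrieve just that attribute. These tasks are handled by the hasattr() and getattr() functions.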

>>> hasattr(id, '__doc__')
True

 
isinstance()
The isinstance() function tests an object to determine whether it is an instance of a particular type or custom class:

>>> isinstance("python", str)
True

 
 
 

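Python introspection - 大雄 blog http://www.woola.net/detail/2016-08-28-python-object-introspection.html

Python introspection

This is another of Python's powerful features.

What is introspection? Introspection is the ability to examine something to determine what it is, what it knows, and what it is capable of doing.

Introspection gives programmers a great deal of flexibility and control.

Introspection means that a program written in an object-oriented language can find out the type of an object at runtime.

In short, the type of an object can be obtained at runtime, for example with: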

type()
dir()
getattr()
hasattr()
isinstance()
...

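Understanding metaclasses

Before we can understand metaclasses, we need a firm grasp of Python classes. Python's notion of a class is rather interesting and is borrowed in part from Smalltalk (a language that, admittedly, I don't know!).

>>> class ObjectCreator(object):
...     pass
...

What is a class? It is a piece of code that can be instantiated into objects, and in Python the class is itself an object.

Therefore:

  • it can be assigned to a variable
  • it can be copied
  • attributes can be added to it
  • it can be passed as a function argument

...

For example: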

>>> print(ObjectCreator) # a class is an object, so it can be printed
<class '__main__.ObjectCreator'>
>>> def echo(o):
...     print(o)
...
>>> echo(ObjectCreator) # a class can be passed as an argument
<class '__main__.ObjectCreator'>
>>> print(hasattr(ObjectCreator, 'new_attribute'))
False
>>> ObjectCreator.new_attribute = 'foo' # add an attribute
>>> print(hasattr(ObjectCreator, 'new_attribute'))
True
>>> print(ObjectCreator.new_attribute)
foo
>>> ObjectCreatorMirror = ObjectCreator # assign to a variable
>>> print(ObjectCreatorMirror.new_attribute)
foo
>>> print(ObjectCreatorMirror())
<__main__.ObjectCreator object at 0x8997b4c>

Classes can also be created dynamically!

For example:

>>> def choose_class(name):
...     if name == 'foo':
...         class Foo(object):
...             pass
...         return Foo # return the class, not an instance
...     else:
...         class Bar(object):
...             pass
...         return Bar
...
>>> MyClass = choose_class('foo')
>>> print(MyClass) # the function returns a class, not an instance
<class '__main__.Foo'>
>>> print(MyClass()) # you can create an object from this class
<__main__.Foo object at 0x89c6d4c>

The class statement is the standard way to create a class, but you still have to write the class definition by hand.

What about situations we cannot know in advance, such as a typical ORM, where we need attributes that we never defined by hand?

Python has a long-standing answer: type()

>>> print(type(1))
<type 'int'>
>>> print(type("1"))
<type 'str'>
>>> print(type(ObjectCreator))
<type 'type'>
>>> print(type(ObjectCreator()))
<class '__main__.ObjectCreator'>

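As we can see, type() reports the type of an object at runtime. It also has a completely different ability: it can create classes dynamically while the program is running!

type(): the key point!

The definition of type

type(class name,
     tuple of base classes (may be empty),
     dictionary of attribute names and values (may be empty))
# e.g. inheritance (Foo is assumed to have been defined with a class attribute bar = True)
>>> class FooChild(Foo):
...     pass
...
# the class statement above is equivalent to:
>>> FooChild = type('FooChild', (Foo,), {})
>>> print(FooChild)
<class '__main__.FooChild'>
>>> print(FooChild.bar) # bar is inherited from Foo
True
# e.g. defining attributes
>>> def echo_bar(self):
...     print(self.bar)
...
>>> FooChild = type('FooChild', (Foo,), {'echo_bar': echo_bar})
>>> hasattr(Foo, 'echo_bar')
False
>>> hasattr(FooChild, 'echo_bar')
True
>>> my_foo = FooChild()
>>> my_foo.echo_bar()
True
# e.g. adding a method after the class has been created
>>> def echo_bar_more(self):
...     print('yet another method')
...
>>> FooChild.echo_bar_more = echo_bar_more
>>> hasattr(FooChild, 'echo_bar_more')
True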
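The three-argument form of type() shown above can be tried as a self-contained script. This is a minimal sketch in Python 3 syntax; here Foo is itself built with type() and given bar = True, which is an assumption made so the example runs on its own.

```python
def echo_bar(self):
    """Return the instance's bar attribute."""
    return self.bar


# type(name, bases, attributes) builds a class without a class statement.
Foo = type("Foo", (), {"bar": True})
FooChild = type("FooChild", (Foo,), {"echo_bar": echo_bar})

child = FooChild()
print(FooChild.__name__)
print(issubclass(FooChild, Foo))   # FooChild inherits from Foo
print(child.echo_bar())            # bar is found on the parent class
print(hasattr(Foo, "echo_bar"))    # the method exists only on the child
```

Functions placed in the attribute dictionary become ordinary methods, which is why echo_bar() receives the instance as self.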

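Normally we create objects like this:

MyClass = MetaClass()
MyObject = MyClass()
# Using the type metaclass, we can write
MyClass = type('MyClass', (), {})

In Python everything is an object, and every object is created from a class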

>>> age = 35
>>> age.__class__
<type 'int'>
>>> name = 'bob'
>>> name.__class__
<type 'str'>
>>> def foo(): pass
>>> foo.__class__
<type 'function'>
>>> class Bar(object): pass
>>> b = Bar()
>>> b.__class__
<class '__main__.Bar'>

So what is the class of a class?

>>> age.__class__.__class__
<type 'type'>
>>> name.__class__.__class__
<type 'type'>
>>> foo.__class__.__class__
<type 'type'>
>>> b.__class__.__class__
<type 'type'>

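What is a metaclass

Let's first look at __new__ and __init__:

  • __new__ is a static method, while __init__ is an instance method.
  • __new__ returns the newly created instance, while __init__ returns nothing.
  • __init__ is called only if __new__ returns an instance of cls.
  • __new__ is called to create a new instance; __init__ is called to initialize it.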
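The division of labour between __new__ and __init__ listed above can be observed directly. The following is a small sketch in Python 3 syntax; Point, NotAPoint, and the calls list are hypothetical names used only to record the order of events.

```python
calls = []


class Point:
    def __new__(cls, x, y):
        calls.append("Point.__new__")
        return super().__new__(cls)        # create and return the instance

    def __init__(self, x, y):
        calls.append("Point.__init__")     # runs after __new__, returns nothing
        self.x, self.y = x, y


class NotAPoint:
    def __new__(cls):
        calls.append("NotAPoint.__new__")
        return 42                          # not an instance of cls

    def __init__(self):
        calls.append("NotAPoint.__init__") # skipped: __new__ returned an int


p = Point(1, 2)
n = NotAPoint()
print(calls)
print(p.x, p.y, n)
```

Because NotAPoint.__new__ returns something that is not an instance of the class, Python never calls NotAPoint.__init__.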

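So a metaclass is the thing that creates class objects.

You can call it a "class factory" if you like. type is Python's built-in metaclass, and of course you can create your own.

The __metaclass__ attribute

When you write a class, you can add a __metaclass__ attribute:

class Foo(object):
    __metaclass__ = something...
    [...]

If you do so, Python will use the metaclass to create the class Foo.

Careful, there are some subtleties here.

You first write class Foo(object), but the class object Foo is not yet created in memory.

Python looks for __metaclass__ in the class definition. If it finds it, it uses it to create the class object Foo. If it doesn't, it uses type to create the class.

Read the following passage several times.

When you write:

class Foo(Bar):
    pass

Python does the following:

Is there a __metaclass__ attribute in Foo?

If yes, Python creates in memory a class object (I said a class object, stay with me here) named Foo, using what is in __metaclass__.

If Python can't find __metaclass__, it looks for a __metaclass__ attribute in Bar (the parent class) and tries to do the same.

If Python can't find __metaclass__ in any parent, it looks for __metaclass__ at the module level and tries to do the same.

If it still can't find any __metaclass__, Python uses the built-in type to create the class object.

Now the question is: what can you put in __metaclass__?

The answer: something that can create a class.

And what can create a class? type, or anything that uses or subclasses type.

Custom metaclasses

The main purpose of a metaclass is to change a class automatically when it is created.

You usually do this for APIs, where you want to create classes matching the current context.

Imagine a silly example: you decide that all class attributes in your module should be written in uppercase. There are several ways to do this, and one of them is to set __metaclass__ at the module level.

This way, all classes in this module are created using this metaclass, and we only have to tell the metaclass to turn all attributes to uppercase.

Luckily, __metaclass__ can actually be any callable; it doesn't need to be a formal class (I know, something with 'class' in its name doesn't need to be a class, go figure, but it's helpful).

So we will start with a simple example, using a function.

The metaclass automatically receives the same arguments that you usually pass to 'type':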

def upper_attr(future_class_name, future_class_parents, future_class_attr):
    """
    Return a class object with its attribute names turned into uppercase.
    """

    # pick every attribute that doesn't start with '__' and uppercase its name
    uppercase_attr = {}
    for name, val in future_class_attr.items():
        if not name.startswith('__'):
            uppercase_attr[name.upper()] = val
        else:
            uppercase_attr[name] = val

    # let 'type' create the class
    return type(future_class_name, future_class_parents, uppercase_attr)

__metaclass__ = upper_attr # this will affect the whole module

class Foo(): # global __metaclass__ won't work with "object" though
    # we could also define __metaclass__ here instead, to affect only this class
    bar = 'bip'

print(hasattr(Foo, 'bar'))
# Output: False
print(hasattr(Foo, 'BAR'))
# Output: True

f = Foo()
print(f.BAR)
# Output: 'bip'

Now let's do the same thing again, but this time using a real class as the metaclass.

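# remember that 'type' is actually a class, like 'str' and 'int'
# so you can inherit from it
class UpperAttrMetaclass(type):
    # __new__ is the special method called before __init__
    # it is the method that creates the object and returns it
    # while __init__ just initializes the object with the arguments passed in
    # you rarely use __new__, except when you want to control how the object is created
    # here the created object is the class, and we want to customize it, so we override __new__
    # you can do some work in __init__ too if you wish
    # some advanced uses involve overriding __call__ as well, but we won't do that here
    def __new__(upperattr_metaclass, future_class_name,
                future_class_parents, future_class_attr):

        uppercase_attr = {}
        for name, val in future_class_attr.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        return type(future_class_name, future_class_parents, uppercase_attr)

But this is not really OOP. We call type directly, and we neither override nor call the parent's __new__. Let's do that: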

class UpperAttrMetaclass(type):

    def __new__(upperattr_metaclass, future_class_name,
                future_class_parents, future_class_attr):

        uppercase_attr = {}
        for name, val in future_class_attr.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        # reuse the type.__new__ method
        # this is basic OOP, nothing magic in there
        return type.__new__(upperattr_metaclass, future_class_name,
                            future_class_parents, uppercase_attr)

You may have noticed the extra argument upperattr_metaclass. There is nothing special about it: __new__ always receives the class it is defined in as its first parameter, just as ordinary methods receive the instance as self.

Of course, the names I used here are long for the sake of clarity, but as with self, all the arguments have conventional names. So a real production metaclass would look like this:

class UpperAttrMetaclass(type):

    def __new__(cls, clsname, bases, dct):

        uppercase_attr = {}
        for name, val in dct.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        return type.__new__(cls, clsname, bases, uppercase_attr)

We can make it even cleaner by using super, which eases inheritance (because yes, you can have metaclasses, inheriting from metaclasses, inheriting from type):

class UpperAttrMetaclass(type):

    def __new__(cls, clsname, bases, dct):

        uppercase_attr = {}
        for name, val in dct.items():
            if not name.startswith('__'):
                uppercase_attr[name.upper()] = val
            else:
                uppercase_attr[name] = val

        return super(UpperAttrMetaclass, cls).__new__(cls, clsname, bases, uppercase_attr)
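The listings in this section use the Python 2 __metaclass__ attribute. In Python 3 that attribute is ignored, and the metaclass is passed as a keyword argument in the class header instead. The following sketch restates the same uppercase example in Python 3 syntax; it is an adaptation for illustration, not the original author's code.

```python
class UpperAttrMetaclass(type):
    def __new__(cls, clsname, bases, dct):
        # Uppercase every attribute name that does not start with '__'.
        uppercase_attr = {
            (name if name.startswith("__") else name.upper()): val
            for name, val in dct.items()
        }
        return super().__new__(cls, clsname, bases, uppercase_attr)


# Python 3: the metaclass is named in the class header.
class Foo(metaclass=UpperAttrMetaclass):
    bar = "bip"


print(hasattr(Foo, "bar"))
print(hasattr(Foo, "BAR"))
print(Foo().BAR)
print(type(Foo).__name__)
```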

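That's it. There is really nothing more to say about metaclasses.

The reason code using metaclasses is complicated is not the metaclasses themselves; it is that you usually use metaclasses to do twisted things that rely on introspection, manipulating inheritance, and so on.

Indeed, metaclasses are especially useful for doing black magic, and therefore complicated things. But by themselves, they are simple:

  • intercept the creation of a class
  • modify the class
  • return the modified class

Why use a metaclass class rather than a function?

Since __metaclass__ accepts any callable, why would you use a class, given that it is obviously more complicated?

There are several reasons:

  • The intention is clearer. When you read UpperAttrMetaclass(type), you know what is going to follow.
  • You can use OOP. A metaclass can inherit from a metaclass and override parent methods. Metaclasses can even use metaclasses.
  • You can organize your code better. You never use metaclasses for something as trivial as the example above; it is usually for something complicated. Grouping several methods into one class is very helpful and makes the code easier to read.
  • You can use the special methods __new__, __init__, and __call__, which let you handle different tasks. Even if you can usually do everything in __new__, some people are more comfortable using __init__.
  • Whoa, this thing is called a metaclass; it must be trouble, so I'd better be careful!

After all that, why on earth would you use metaclasses?

Now back to the big question: why would you use such an obscure, error-prone feature?

Well, usually you don't:

"Metaclasses are deeper magic that 99% of users should never worry about. If you wonder whether you need them, you don't. The people who actually need them know with certainty that they need them, and don't need an explanation about why." (Python guru Tim Peters)

The main use of a metaclass is creating an API. A typical example is the Django ORM.

It allows you to define something like this: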

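class Person(models.Model):
    name = models.CharField(max_length=30)
    age = models.IntegerField()

But if you do this:

guy = Person(name='bob', age='35')
print(guy.age)

it won't return an IntegerField object. It will return an int, and can even take it directly from the database.

This is possible because models.Model defines __metaclass__, and it uses some magic that turns the simple Person class you just defined into a complex hook to the database.

Django makes something complex look simple by exposing a simple API and using metaclasses, recreating code from this API to do the real work behind the scenes.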
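To make the idea concrete, here is a deliberately tiny sketch of how a metaclass could turn field declarations into plain values. It is not Django's implementation: Field, CharField, IntegerField, ModelMeta, and Model below are hypothetical stand-ins written in Python 3 syntax, with no database and no support for inheriting fields from parent models.

```python
class Field:
    """Hypothetical field declaration; to_python() converts raw input."""

    def to_python(self, value):
        return value


class CharField(Field):
    def to_python(self, value):
        return str(value)


class IntegerField(Field):
    def to_python(self, value):
        return int(value)


class ModelMeta(type):
    def __new__(cls, clsname, bases, dct):
        # Collect the Field declarations and move them out of the class body.
        fields = {k: v for k, v in dct.items() if isinstance(v, Field)}
        for name in fields:
            del dct[name]
        dct["_fields"] = fields
        return super().__new__(cls, clsname, bases, dct)


class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        # Store converted values on the instance, not Field objects.
        for name, field in self._fields.items():
            setattr(self, name, field.to_python(kwargs[name]))


class Person(Model):
    name = CharField()
    age = IntegerField()


guy = Person(name="bob", age="35")
print(guy.age, type(guy.age).__name__)
```

The class body reads like a declaration, while the metaclass quietly rewrites it when the class is created; that is the pattern the text attributes to ORM-style APIs.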

Conclusion

First, you know that classes are objects that can create instances.

Well, in fact, classes are themselves instances: instances of metaclasses.

>>> class Foo(object): pass
>>> id(Foo)
142630324

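Everything is an object in Python, and every object is either an instance of a class or an instance of a metaclass.

Except for type. type is actually its own metaclass. This is not something you could reproduce in pure Python; it is done by cheating a little at the implementation level.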
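The chain from instance to class to metaclass, ending at type being its own type, can be checked directly. A short sketch in Python 3 syntax, where Foo is just a placeholder class:

```python
class Foo:
    pass


foo = Foo()

# Walk up from an instance to its class, then to the class of that class.
chain = [type(foo), type(Foo), type(type)]
print(chain)

# type is an object, and object is an instance of type.
print(isinstance(type, object), isinstance(object, type))
```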

Second, metaclasses are complicated. You may not want to use them for very simple class alterations. You can change classes by using two other techniques: