
PyPy Sprint Day 2

Today I continued working on the rctime module under Anders’ supervision. I’m improving my knowledge of RPython, of how the interpreter level and the application level work, and of how to make them communicate. This is not really documented in depth, so you have to check out the examples too.

Sad, a Swiss boy, joined us and started learning how rctypes works. I think he’s porting the zlib module by himself. Now he has a codespeak account so he can work on the live project.

We checked in the fix to format_someobject_error because other people who hit the same problem needed it.

I now know the basics of the machinery of making tests pass and the translation process too.

You can’t use space.unwrap() directly in interpreter-level code that gets translated because it’s not supported at all (thanks Armin and Anders!), so you have to use the XXX_w functions of the object space.

In the morning I also tried to make my stuff work on Windows, but I gave up after installing the .NET SDK 1.1, the .NET runtime 1.1 and the mingw compiler, and doing some hacking to work around the actual problem: Python 2.4 is compiled against msvcr71.dll, which is part of the MS VC++ 2003 toolkit that is not available anymore. Visual Studio 2005 provides msvcr80.dll, so I didn’t manage to make it work. Windows is very unfriendly to developers who don’t use the full set of MS tools.

I completed time.sleep(), time.clock(), time.ctime(), time.struct_time, time.gmtime(), time.localtime(), time.mktime()

I have a problem using the translated module on CPython: the check for the presence of an optional argument doesn’t work. So basically, if you don’t pass anything to, for example, time.localtime(), it fails, but if you explicitly pass None as the argument it works well. On py.py it works very well. Tonight I’ll try to translate all of PyPy.
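The behavior I’m after can be sketched in plain Python like this. It’s only an illustration, not the rctime code: localtime_compat is a made-up name, and the point is just that a missing argument and an explicit None should both mean "now".

```python
import time

def localtime_compat(seconds=None):
    """Hypothetical helper: treat a missing argument and None the same way."""
    if seconds is None:
        seconds = time.time()  # nothing passed: fall back to the current time
    return time.localtime(seconds)

no_arg = localtime_compat()             # argument omitted
explicit_none = localtime_compat(None)  # argument passed as None
```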

After the sprint we had dinner, and afterwards Fabrizio, Alexander, Antonio and I joined the summer students at a very fun Dutch party. :-) Summer students here at CERN know how to have fun! They’re obviously physicists or computer scientists from all over the world, from Japan to New York City.

PyPy Sprint Day 1

This is the report of the first day of the PyPy sprint session here at CERN.

The sprint began in the early afternoon (1:30 pm) because this morning there was the CERN visit.

They started doing a crash-course-tutorial-so-my-head-did-boom presentation. PyPy is really complicated but it’s an interesting challenge. If you want to know more about its architecture and its present capabilities google for PyPy…

After the intro we (about 20 people) decided the topics of the sprint and split up into pairs (almost). These are the actual topics: planning.txt

So we began working!

I paired with Anders and Brian (a smart South African guy who lives in Barcelona) and we dug deep into the extension compiler machinery and the rctypes limitations in order to port my previous work. We started kinda slow because… because that’s how you learn things. We managed to write some tests and a little functionality (basically time.accept2dyear and time.time()). We also discovered a couple of bugs in the PyPy translation toolchain. One has been fixed: the problem was that C macros (geterrno specifically) are not supported by rctypes so… we simply wrapped it in a function.
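Plain ctypes is in the same boat: it can only call symbols exported by a shared library, so a C macro like errno has to be reached through a function. Here is a small sketch of that idea using ctypes.get_errno() on a current CPython. It assumes a POSIX system where CDLL(None) exposes the C library, and access_errno is a made-up helper name.

```python
import ctypes
import errno
import os
import tempfile

# Assumption: a POSIX system, where CDLL(None) exposes the C library symbols.
libc = ctypes.CDLL(None, use_errno=True)
libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
libc.access.restype = ctypes.c_int

def access_errno(path):
    """Call access(path, F_OK); return 0 on success, else the errno value."""
    ctypes.set_errno(0)
    if libc.access(os.fsencode(path), os.F_OK) == 0:
        return 0
    # errno is a macro in C, so it is read through a function wrapper here
    return ctypes.get_errno()

missing = os.path.join(tempfile.mkdtemp(), "does-not-exist")
code = access_errno(missing)
```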

At the time of writing this post I’ve just finished the whole PyPy -> C translation and it worked, so we didn’t make it explode (we tested it, but you can’t be sure anyway). It took 53 minutes in the end. Nice, but I have a dual-core 2 GHz CPU and 2 GB of RAM…

The other problem is format_someobject_error in the pypy.tool.error module. We had an empty block masking our actual block and a strange IndexError bug. So after some digging Anders came up with a hack (it’s not in the repository, only on my machine) to make it work in the translation process. We have to find the real reason behind this.

So, that’s it. It’s not very much but believe me… it’s brain-exploding stuff if you are a pypy newbie like me :-)

Anders was very kind to let us ask any kind of question and to provide hints and answers. I’m making progress and that’s what counts. Hope to finish all the stuff before the Summer of Code deadline.

UPDATE: I also found myself at a toga party organized by the summer students at CERN, so I met a lot of people and some nice girls :-P. Tomorrow night will be Dutch-party time… so let’s roll!

Implementing Python with .NET

This is the title of the just-finished thesis by Antonio Cuni. He is one of the PyPy developers and the author of the exciting .NET backend named GenCLI.

The thesis is for people (hopefully the entire Python community :-)) who want to better understand PyPy as a whole and its backend.

You can find it here: Implementing Python with .NET.

memory map your entire mind

mmap porting is finished.

Here’s the latest changelog:

  • correctly raises an error if used in concatenation and repetition
  • responds to getitem, setitem
  • correctly raises an error if delitem is called
  • support slicing objects
  • add a big test ported from test_mmap.py
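For reference, this is roughly the indexing behavior being matched, shown with the standard mmap module on a current CPython 3. Note the difference from the Python of this post: items are ints and slices are bytes there, instead of one-character strings.

```python
import mmap

m = mmap.mmap(-1, 16)   # anonymous 16-byte map, zero-filled
m[0:5] = b"hello"       # slice assignment must keep the same length
m[0] = ord("H")         # single items are ints in Python 3
first = m[0]
chunk = m[0:5]
last = m[-1]            # negative indexes count from the end
m.close()
```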

The only remaining issue is the lack of support for the Buffer object protocol, because it is not directly exposed in Python. You can’t pass a map to file.write() or to the regexp methods. Hope someday the Python guys expose this.

You can find everything at the usual place: http://codespeak.net/svn/user/rhymes/modules if you want to test it…

Now onto the fourth module: bz2.

mmap is almost finished but…

… I’m quite puzzled about a couple of things.

First, I don’t understand when PySequence_GetItem gets called. Let me explain:

in mmapmodule.c you can find the following function to implement getitem behavior:

[code lang="c"]
static PyObject *
mmap_item(mmap_object *self, Py_ssize_t i)
{
    CHECK_VALID(NULL);
    if (i < 0 || (size_t)i >= self->size) {
        PyErr_SetString(PyExc_IndexError, "mmap index out of range");
        return NULL;
    }
    return PyString_FromStringAndSize(self->data + i, 1);
}
[/code]

What’s not clear from the C API documentation is that PySequence_GetItem is called automatically before the numeric index is passed to that function. That means if you pass -1 to the map indexing mechanism in this way:

[code lang="python"]
m = mmap.mmap(f.fileno(), size)
print(m[-1])
[/code]

you get it translated to size - 1 automatically.

Update: I dug into CPython more and the truth lies in abstract.c. PyObject_GetItem calls PySequence_GetItem, which in turn converts the negative index before calling the actual mmap_item function.
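The call chain can be mimicked in a few lines of Python. This is a toy model only: FakeMap and sequence_getitem are made-up names standing in for the mmap object and for the index fix-up done in abstract.c.

```python
class FakeMap:
    """Toy stand-in for an mmap object; not the real implementation."""

    def __init__(self, data):
        self.data = data

    def item(self, i):
        # Mirrors mmap_item: it only ever accepts 0 <= i < size.
        if i < 0 or i >= len(self.data):
            raise IndexError("mmap index out of range")
        return self.data[i:i + 1]


def sequence_getitem(obj, i):
    # Mirrors the fix-up described above: a negative index gets the
    # length added to it once before the item function is called.
    if i < 0:
        i += len(obj.data)
    return obj.item(i)


m = FakeMap(b"abcd")
last = sequence_getitem(m, -1)   # item() receives len - 1, i.e. 3
```

An index that is still negative after the fix-up (for example -5 here) reaches item() unchanged and is rejected by its range check.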

As the title says, the mmap porting is almost finished but it’s missing one thing: buffer-like behavior. If you write something like this with CPython’s mmap

[code lang="python"]
m = mmap(fd, size)
print(f.write(m))
[/code]

it works. This triggers getsegcount and getcharbuffer in mmapmodule.c. This is part of the Buffer Object protocol. How to simulate this in Python? I can’t find anything in the Python documentation.

Update: I know about buffer()’s existence, but as the Python documentation says: “Buffer objects are not directly supported by Python syntax, but can be created by calling the builtin function buffer(). They don’t support concatenation or repetition.”
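On a current CPython 3 the picture is different: mmap objects expose the buffer protocol to Python code, so memoryview() and file.write() accept them directly. A small sketch of that, with made-up temporary file names:

```python
import mmap
import os
import tempfile

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "src.bin")
dst_path = os.path.join(workdir, "dst.bin")

with open(src_path, "wb") as f:
    f.write(b"hello mmap")

with open(src_path, "r+b") as src, open(dst_path, "wb") as dst:
    m = mmap.mmap(src.fileno(), 0)   # length 0 maps the whole file
    view = memoryview(m)             # view obtained through the buffer protocol
    head = bytes(view[:5])
    view.release()                   # release the view before closing the map
    dst.write(m)                     # write() reads the map's buffer directly
    m.close()

with open(dst_path, "rb") as f:
    copied = f.read()
```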

mmap methods are completed

Changelog:

  • mmap.size() is done
  • mmap.tell() too
  • mmap.flush() is done
  • I had some problems calling memmove() through ctypes, so I implemented mmap.move() with memcpy()
  • mmap.resize() works where supported (not supported on OSX and FreeBSD)
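A note on the move() item: C only guarantees memcpy() for regions that don’t overlap, while memmove() handles overlap. Current ctypes ships a memmove() helper of its own, so the semantics can be sketched as below. The move function here is a hypothetical illustration on a plain ctypes buffer, not the code from my module.

```python
import ctypes

def move(buf, dest, src, count):
    """Copy count bytes inside buf from offset src to offset dest (overlap-safe)."""
    size = ctypes.sizeof(buf)
    if min(dest, src, count) < 0 or max(dest, src) + count > size:
        raise ValueError("source or destination out of range")
    base = ctypes.addressof(buf)
    ctypes.memmove(base + dest, base + src, count)

buf = ctypes.create_string_buffer(b"abcdefgh")  # 8 bytes plus a trailing NUL
move(buf, dest=2, src=0, count=4)               # the two ranges overlap
result = buf.raw[:8]
```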

I discovered an ugly alignment problem when calling the msync() syscall, but it has been fixed. The problem showed up on Linux only…

mmap methods are completed; now I have to make it behave like a string and a file-like object for real.
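For the record, msync() expects its start address to be a multiple of the page size. Assuming that is the kind of alignment involved here (the details aren’t recorded in this post), the usual arithmetic is to round the address down and grow the length by the same amount. page_align is a made-up helper name.

```python
import mmap

def page_align(addr, length, pagesize=mmap.PAGESIZE):
    """Round addr down to a page boundary and extend length to cover the same bytes."""
    aligned = addr - (addr % pagesize)
    return aligned, length + (addr - aligned)

# Example with an explicit 4096-byte page size (an assumption for the demo).
start, size = page_align(0x10000 + 100, 50, pagesize=4096)
```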

Installing stuff in Parallels

Parallels is damn cool, and with 2 GB of RAM it’s even cooler. Why does it take 20 minutes to be up and running for the SoC with Ubuntu Dapper, while after two hours of installation/configuration/reboot I still haven’t completed the setup of the XP box?

I’m installing some SDK now…

posted with TextMate

some more mmap

Today’s changelog:

  • mmap.find() is here
  • mmap.seek() is here
  • mmap.write() is here
  • many fixes to make it work under windows as well
  • big refactoring of tests
  • mmap.write_byte() is here
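As a reference for how these methods fit together, here is a short run against the standard mmap module on a current CPython 3 (an anonymous map, so no file is needed). One thing worth knowing: find() starts searching at the current position unless told otherwise, hence the seek(0).

```python
import mmap

m = mmap.mmap(-1, 32)        # anonymous map, zero-filled
m.write(b"hello world")      # writes at the current position and advances it
m.write_byte(ord("!"))       # takes an int in Python 3
end = m.tell()               # position after the 12 bytes written
m.seek(0)                    # rewind, otherwise find() starts at offset 12
where = m.find(b"world")
text = m[:end]
m.close()
```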

I do want mutable strings in Python!:-)

mmap is goin’ on

Here’s today’s changelog:

  • mmap frees resources correctly
  • close() is done and working
  • read_byte() is done and working
  • readline() is done and working
  • find() needs more work.
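The reading side, again shown with the standard mmap module on a current CPython 3 as the reference behavior (this is not the ported module itself):

```python
import mmap

m = mmap.mmap(-1, 16)        # anonymous map, zero-filled
m.write(b"ab\ncd\n")
m.seek(0)
first = m.read_byte()        # one byte, returned as an int in Python 3
line = m.readline()          # the rest of the first line, newline included
m.close()                    # further calls on m raise ValueError
closed = m.closed
```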

Do the bus a bus

The bus error is gone but I’m not really comfortable with the reason why I don’t have that error anymore. Let’s explain a bit:

mmap() is a 6 argument function, all of them required.

Thomas Heller discovered under FreeBSD 6.0 with ktrace that it’s called with 8 parameters, not 6. The two additional arguments are both 0 when the 6-argument mmap() is called from plain C.

If you call mmap() from ctypes, the 7th parameter is zeroed but the 8th is a random hexadecimal value. So it doesn’t work, because C passes 0 there.

A FreeBSD committer friend of mine told me instead that the syscall under mmap() is a 7 argument function, not 8.

Thomas Heller tried with 8 parameters and it worked. Following my friend’s advice, I tried with 7 instead and it worked as well.

So the mystery is still here but the error is gone.

You can discover odd things also under Unix, not only under Win32 :-D

In the end, this is the working code:

[code lang="python"]
from ctypes import *
import ctypes.util

_libc = cdll.LoadLibrary(ctypes.util.find_library("c"))

f = open("foo", "w+")
f.write("foo\0")
f.flush()

_libc.mmap.restype = c_void_p
# 3 is PROT_READ | PROT_WRITE, 1 is MAP_SHARED.
# 7 arguments: the 6 documented ones plus a trailing 0 (see above).
m = _libc.mmap(0, 4, 3, 1, f.fileno(), 0, 0)
print(m)
if m != c_void_p(-1).value:
    c = cast(m, POINTER(c_char))
    print([c[i] for i in range(4)])
f.close()
[/code]
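With a current ctypes, one way to take the guesswork out of the argument count is to declare the prototype, so that each argument is passed with its C width. The sketch below does that for the documented 6-argument mmap(). It assumes a 64-bit POSIX system where off_t is 64 bits wide and where CDLL(None) exposes the C library, and read_mapped is a made-up helper. Whether argument widths are what’s behind the 7-versus-8 mystery above is not established here.

```python
import ctypes
import mmap
import os
import tempfile

# Assumptions: a 64-bit POSIX system where off_t is 64 bits wide and
# CDLL(None) exposes the C library symbols.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_int64]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value   # (void *)-1 as an unsigned integer

def read_mapped(path, length):
    """Map the first `length` bytes of path with 6-argument mmap() and return them."""
    with open(path, "r+b") as f:
        addr = libc.mmap(None, length, mmap.PROT_READ | mmap.PROT_WRITE,
                         mmap.MAP_SHARED, f.fileno(), 0)
        if addr == MAP_FAILED:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        try:
            return ctypes.string_at(addr, length)   # copy the bytes out
        finally:
            libc.munmap(addr, length)

path = os.path.join(tempfile.mkdtemp(), "foo")
with open(path, "wb") as f:
    f.write(b"foo\0")
data = read_mapped(path, 4)
```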

ps. the title is taken from an old Busta Rhymes song ;-)
