Short story: In Flash, bitmapData.lock() is meant to make drawing faster by deferring updates to the on-screen bitmap until you call bitmapData.unlock() once you are done drawing. But once the bitmap is unlocked, the whole bitmap is redrawn instead of just the changed parts. If you only change small parts of a bitmap displayed on screen, and you do it often, this can result in a big performance hit.
I had been trying to optimize rendering of the canvas in Pyxel Edit without much luck, but then I tried Adobe Scout (awesome application btw!), which showed me that most of the application's time was spent by the Flash runtime rendering the display list. Apparently the whole canvas was redrawn each time it changed just a bit. The odd thing was that the brush preview bitmap, which is displayed on top, was only rendered where it changed.
After some experimenting I found out that locking the BitmapData of a Bitmap object, which supposedly makes drawing faster, invalidates the whole bitmap and causes all of it to be redrawn. Commenting out canvasBitmap.lock() immediately cut the display list rendering time by about 1/3 when drawing in the application.
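As a rough sketch of the two ways to avoid the full redraw: the first function below simply draws without locking, which is the change described above. The second uses the optional changeRect parameter of BitmapData.unlock(), which tells the runtime which area changed; without it, the entire BitmapData is considered changed. The function names, parameters and the square "dab" shape are made up for illustration, and I have not verified that the second variant gives the same speedup in Pyxel Edit.

```actionscript
import flash.display.BitmapData;
import flash.geom.Rectangle;

// Hypothetical helper: paint a small square dab without lock()/unlock().
// Only the area touched by fillRect() should be marked as changed.
function paintDabUnlocked(canvasData:BitmapData, x:int, y:int, size:int, color:uint):void {
    canvasData.fillRect(new Rectangle(x, y, size, size), color);
}

// Hypothetical alternative: keep lock()/unlock(), but pass the dirty
// rectangle to unlock() so the whole bitmap is not considered changed.
// Assumption: this limits the redraw the same way; measure it to be sure.
function paintDabLocked(canvasData:BitmapData, x:int, y:int, size:int, color:uint):void {
    var dirty:Rectangle = new Rectangle(x, y, size, size);
    canvasData.lock();
    canvasData.fillRect(dirty, color);
    canvasData.unlock(dirty);
}
```

The locked variant only makes sense when several drawing calls are batched between lock() and unlock(); for a single small fillRect() the unlocked version is simpler.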
To see which parts of the display list are redrawn, one handy function is flash.profiler.showRedrawRegions(). It draws rectangles around each redrawn region.
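A minimal usage sketch: the function takes a Boolean to switch the outlines on or off and an optional outline color. The green color here is an arbitrary choice, and as far as I know the outlines only show up in the debugger version of Flash Player.

```actionscript
import flash.profiler.showRedrawRegions;

// Turn on redraw outlines, drawn in green instead of the default red.
showRedrawRegions(true, 0x00FF00);

// ... draw on the canvas and watch which rectangles get outlined ...

// Turn the outlines off again when done.
showRedrawRegions(false);
```

If the outline covers the whole canvas after a small brush stroke, the full bitmap is being invalidated; if it hugs the stroke, only the changed area is redrawn.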