py/map: Convert map implementation to preserve insertion order. #10
py/map: Convert map implementation to preserve insertion order. #10andrewleech wants to merge 55 commits intomasterfrom
Conversation
|
Code size report: |
|
/review |
There was a problem hiding this comment.
Several blocking issues need addressing before this can land. The most critical are: an out-of-bounds read in mp_map_init_copy that ASAN catches on dict_fixed.py, a memory leak from m_del being called with the wrong byte size on every map dealloc/rehash, and heap-locked allocation failures in settrace_stackless (heapalloc.py, extreme_exc.py, heapalloc_super.py crash with MemoryError: memory allocation failed, heap is locked). DEVLOG.md should not be committed to the repository. Also missing: a MICROPY_PY_MAP_ORDERED compile-time toggle to select between implementations (noted as TODO in the DEVLOG itself).
CI Issues: sanitize_address fails on basics/dict_fixed.py (out-of-bounds read in mp_map_init_copy). settrace_stackless fails on micropython/heapalloc.py, micropython/extreme_exc.py, and micropython/heapalloc_super.py — all crash with MemoryError: memory allocation failed, heap is locked, indicating the new implementation triggers heap allocation during previously allocation-free operations.
DEVLOG.md
Outdated
| @@ -0,0 +1,314 @@ | |||
| # MicroPython Ordered Dict Development Log | |||
There was a problem hiding this comment.
Don't commit this to the repo.
py/map.c
Outdated
| map->is_ordered = src->is_ordered; | ||
| size_t n = MP_MAP_TABLE_BYTE_SIZE(map->alloc); | ||
| map->table = m_malloc0(n); | ||
| memcpy(map->table, src->table, n); |
There was a problem hiding this comment.
src->table for an ordered map (including ROM maps) has only src->alloc * sizeof(mp_map_elem_t) bytes — there are no hash indices appended. Copying MP_MAP_TABLE_BYTE_SIZE(src->alloc) bytes is an out-of-bounds read. This is what ASAN catches in dict_fixed.py. Need:
size_t n = src->is_ordered
? src->alloc * sizeof(mp_map_elem_t)
: MP_MAP_TABLE_BYTE_SIZE(src->alloc);| mp_map_lookup(map, old_table[i].key, MP_MAP_LOOKUP_ADD_IF_NOT_FOUND)->value = old_table[i].value; | ||
| } | ||
| } | ||
| m_del(mp_map_elem_t, old_table, old_alloc); |
There was a problem hiding this comment.
m_del(mp_map_elem_t, old_table, old_alloc) frees old_alloc * sizeof(mp_map_elem_t) bytes, but the allocation was MP_MAP_TABLE_BYTE_SIZE(old_alloc) bytes. Memory leak of MP_MAP_INDEX_SIZE(old_alloc) * old_alloc bytes on every rehash. Same bug in mp_map_compact and in mp_map_deinit (which uses the pre-existing m_del call that now also has the wrong size). Need m_free(old_table, MP_MAP_TABLE_BYTE_SIZE(old_alloc)) at all three sites.
| // Skip if dict is now empty (no point compacting an empty dict). | ||
| if (map->filled > 0 && map->used - map->filled > map->filled / 2) { | ||
| mp_map_compact(map); | ||
| // After compact, original slot is invalid; return value in first empty slot |
There was a problem hiding this comment.
After mp_map_compact(), map->used == map->filled and &map->table[map->used] is the next insertion slot. Storing the deleted value there with key = MP_OBJ_NULL is fragile — any reentrant lookup that inserts will clobber that slot before the caller reads slot->value. Pass the value out via an out-param or save it before calling compact.
py/obj.h
Outdated
| void mp_map_init(mp_map_t *map, size_t n); | ||
| void mp_map_init_fixed_table(mp_map_t *map, size_t n, const mp_obj_t *table); | ||
| void mp_map_init_copy(mp_map_t *map, const mp_map_t *src); | ||
| mp_map_t *mp_map_new(size_t n); |
There was a problem hiding this comment.
mp_map_new and mp_map_free are declared here but not defined anywhere in this diff.
py/map.c
Outdated
| // Macros and functions to deal with key/value table and hash table. | ||
| // map->table points to the key/value table, then the hash table follows, | ||
| // which can be uint8_t, uint16_t, or uint32_t depending on allocation size. | ||
| #define MP_MAP_IS_UINT8(alloc) ((alloc) < 255) |
There was a problem hiding this comment.
Should be < 256. With 1-based indexing, alloc=255 stores max index 255 which fits in uint8_t.
| #define MP_MAP_IS_UINT8(alloc) ((alloc) < 255) | |
| #define MP_MAP_IS_UINT8(alloc) ((alloc) < 256) |
py/mpconfig.h
Outdated
|
|
||
| // Whether to support dicts with >65535 elements (requires uint32_t hash indices) | ||
| #ifndef MICROPY_PY_MAP_LARGE | ||
| #define MICROPY_PY_MAP_LARGE (1) |
There was a problem hiding this comment.
Should default to 0 — most embedded targets won't need >65535-element dicts, and enabling this by default adds the uint32_t index paths to all builds.
py/obj.h
Outdated
| size_t used : (8 * sizeof(size_t) - 3); | ||
| size_t used : (8 * sizeof(size_t) - 3); // next insertion index in dense array | ||
| size_t alloc; | ||
| size_t filled; // number of non-deleted entries (for O(1) len()) |
There was a problem hiding this comment.
Adding a full size_t filled costs +4 or +8 bytes per mp_map_t. Every module dict, class dict, and instance dict pays this. Is there a way to track it more cheaply, e.g., derive it from used minus the tombstone count?
This commit updates the documentation for the `re` library, officially documenting non-capturing grouping rules (ie. "(?:...)"). The documentation mistakenly marked that feature as not supported, but is is indeed supported in the current iteration of the regex library. This closes micropython#18900. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This reverts commit 046013a. Looks like since the latest round of GitHub Actions updates, the Cache LRU algorithm is working as designed again. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Both the overall IRQ line and the per-channel IRQ, for good measure. Otherwise, soft reset will remove the handler before the finaliser for the DMA object(s) run and trigger IRQs if the channel is still active. Closes micropython#18765 This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Small tweak to avoid changes in other targets' lockfiles from printing warnings when building esp32 port. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Not currently building, and too many versions to concurrently support. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Also rename the prefix from can to pyb_can, in anticipation of machine.CAN tests. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Closes micropython#18922 Signed-off-by: Angus Gratton <angus@redyak.com.au>
The function arguments mean totally different things for Classic vs FDCAN hardware, but the argument name wasn't particularly clear for either. This commit shouldn't really change the binary firmware at all. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Simplifies the pattern of an optional arg which can be a list of at least a certain length, otherwise one is lazily initialised. Modify pyb.CAN and ESP-NOW APIs to use the helper. Note this changes the return type of pyb.CAN.recv() from tuple to list. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
API is different to the original machine.CAN proposal, as numerous shortcomings were found during initial implementation. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
These are oddly missing from the STM32G4 HAL, but the reference manual describes being able to use them and the implementations seem to work as expected. Note that unlike STM32H7 it doesn't seem like we must use this approach, because HAL_FDCAN_AddMessageToTxFifoQ() does seem to not have the issues with priority inversion seen on the H7. However it's simpler to use the same API for both... Signed-off-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Angus Gratton <angus@redyak.com.au>
Implemented according to API docs in a parent comment. Adds new multi_extmod/machine_can_* tests which pass when testing between NUCLEO_G474RE, NUCLEO_H723ZG and PYBDV11. This work was mostly funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
The DAR register field is for auto-retransmit, FDCAN doesn't support automatic restart to clear Bus Off. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>
Some MCUs (eg N6) have more timers which are 32-bit, and it's best to use this macro to work that out. Signed-off-by: Damien George <damien@micropython.org>
This functionality already exists in the TIM code, and can be reused by the upcoming PWM implementation. Signed-off-by: Damien George <damien@micropython.org>
This commit implements the standard `machine.PWM` class on stm32, using the common bindings in `extmod/machine_pwm.c`. Features implemented are: - construct a PWM object from a pin, with automatic selection of TIM instance and channel; - get and set freq, duty_u16 and duty_ns; - optionally invert the output. The PWM objects are static objects (partly in ROM, partly in RAM) so creating a PWM instance on the same pin will return exactly the same object. That's consistent with other peripherals in the stm32 port, and consistent with other PWM implementations (eg rp2). When creating a PWM object on a pin, if that pin has multiple TIM instances then only the first will be selected. A future extension could allow selecting the TIM/channel (eg similar to how ADCBlock allows selecting an ADC). Signed-off-by: Damien George <damien@micropython.org>
When assigning a TIMx_CHy to a pin, the second available alternate function is chosen (or the first if there is only one). This gives better overall static allocation of TIM's to pins. On most MCUs (eg F4, F7, H5, H7) picking the second gives TIM5_CH[1-4] for PA0-PA3, and TIM5 is a 32-bit timer. That leaves TIM2 (also usually on PA0-PA3) for other pins that only have TIM2. For STM32G0, STM32L432 and STM32L452 the heuristic is to simply use the first available alternate function because that gives TIM2 (a 32-bit timer) on PA0-PA3. The above heuristic guarantees that PA0-PA3 always get a 32-bit timer on all supported MCUs. Signed-off-by: Damien George <damien@micropython.org>
To be slightly more accurate computing the expected low/high times for the PWM output. Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
This allows the newly-added `machine.PWM` class to fit on these boards, which is arguably more useful than the features disabled in this commit. Signed-off-by: Damien George <damien@micropython.org>
So that TIM2_CH1 can be used. Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Add DMA, NPU and PDM IRQ priorities to irq.h. Signed-off-by: iabdalkader <i.abdalkader@gmail.com>
2.25k Seems necessary so it doesn't crash `thread/thread_stacksize1.py`. But 2.5k gives a little extra headroom to make that test actually pass. Signed-off-by: Damien George <damien@micropython.org>
With the recent addition of `machine.PWM` and `machine.CAN`, the internal flash of PYBD_SF3 overflows by about 300 bytes. This commit moves the inline assembler compiler functions from internal to external QSPI flash. That frees up about 3k internal flash, and shouldn't affect performance. Signed-off-by: Damien George <damien@micropython.org>
qstr literals are of type qstr_short_t (aka uint16_t) for efficiency, but when the type is passed to `mp_printf` it must be cast explicitly to type `qstr`. These locations were found using an experimental gcc plugin for `mp_printf` error checking. Signed-off-by: Jeff Epler <jepler@unpythonic.net>
This adds support for the standard `weakref` module, to make weak references to Python objects and have callbacks for when an object is reclaimed by the GC. This feature was requested by PyScript, to allow control over the lifetime of external proxy objects (distinct from JS<->Python proxies). Addresses issue micropython#646 (that's nearly a 12 year old issue!). Functionality added here: - `weakref.ref(object [, callback])` create a simple weak reference with optional callback to be called when the object is reclaimed by the GC - `weakref.finalize(object, callback, /, *args, **kwargs)` create a finalize object that holds a weak reference to an object and allows more convenient callback usage and state change The new module is enabled at the "everything" level. The implementation aims to be as efficient as possible, by adding another bit-per-block to the garbage collector, the WTB (weak table). Similar to the finalizer bit (FTB), if a GC block has its corresponding WTB bit set then a weak reference to that block is held. The details of that weak reference are stored in a global map, `mp_weakref_map`, which maps weak reference to ref/finalize objects, allowing the callbacks to be efficiently found when the object is reclaimed. With this feature enabled the overhead is: - 1/128th of the available memory is used for the new WTB table (eg a 128k heap now needs an extra 1k for the WTB). - Code size is increased. - At garbage collection time, there is a small overhead to check if the collected objects had weak references. This check is the same as the existing FTB finaliser scan, so shouldn't add much overhead. If there are weak reference objects alive (ref/finalize objects) then additional time is taken to call the callbacks and do some accounting to clean up the used weak reference. Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
Needs a native exp file because native code doesn't print line numbers in the traceback. Signed-off-by: Damien George <damien@micropython.org>
Following a69425b, this is a convenient way to run a subset of tests. Signed-off-by: Damien George <damien@micropython.org>
The webassembly port needs some additional weakref tests due to the fact that garbage collection only happens when Python execution finishes and JavaScript resumes. The `tests/ports/webassembly/heap_expand.py` expected output also needs to be updated because the amount of GC heap got smaller (weakref WTB takes some of the available RAM). Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Damien George <damien@micropython.org>
It takes longer now that weakref is enabled in the coverage build. Signed-off-by: Damien George <damien@micropython.org>
This brings in: - sdcard: Send stop bit after multi-block read/write - sdcard: Compute CRC7 for all SPI commands - sdcard: Add read/write speed test to sdtest - lsm6dsox: Add pedometer support - lsm6dsox: Add pedometer example code - unix-ffi/re: Handle PCRE2_UNSET in group and groups methods - unix-ffi/re: Add tests for empty string match in ffi regex - unix-ffi/machine: Retrieve a unique identifier if one is known - senml/docs: Correct capitalization of 'MicroPython' - unix-ffi/_libc: Extend FreeBSD libc versions range - string: Convert string module to package and import templatelib Signed-off-by: Damien George <damien@micropython.org>
If a port enables t-strings then it is required to have the `string.templatelib` package (at least to run the tests). That's automatically the case if `MICROPY_PY_TSTRINGS` is enabled. If a port freezes in the micropython-lib `string` extension package then the latest version of this package will include the built-in `string.templatelib` classes. So the feature check for t-strings no longer needs to check if they are available. Signed-off-by: Damien George <damien@micropython.org>
Includes a fix to STA teardown to deinit tcpip and clear itf_state. Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Matt Trentini <matt.trentini@gmail.com>
Signed-off-by: Dryw Wade <dryw.wade@sparkfun.com>
Signed-off-by: Matt Trentini <matthew.trentini@planetinnovation.com.au> Signed-off-by: Matt Trentini <matt.trentini@gmail.com>
740bcf8 to
fb2071d
Compare
Signed-off-by: Damien George <damien@micropython.org>
Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
fb2071d to
2197cfe
Compare
Summary
py/map: Convert map implementation to preserve insertion order.
Testing
TBD
Trade-offs and Alternatives
TBD