Commit Graph

208 Commits

SHA1 Message Date
f02dc51883 update gitignore 2026-05-14 02:02:27 +09:00
f4a73099a0 refactor project structure and add documents. 2026-05-14 02:00:09 +09:00
a0c0231613 Update JobHandle 2026-05-12 12:18:20 +09:00
ee37cb0954 Refactor memory diagnostics & slot map internals
Renamed UnsafeMemoryDiagnostic to MemoryDiagnostic and updated all usages, including tests. Refactored UnsafeSlotMap<T> and ConcurrentSlotMap<T> to use separate arrays for values, generations, and validity, removing SlotEntry. Ensured all slot-based containers initialize generations to 1. Added MemorySnapshot struct for allocation tracking. Bumped assembly versions, cleaned up benchmarking code, and improved thread safety and initialization logic. Updated vector codegen to include [JsonIgnore].
2026-05-12 12:12:17 +09:00
44c02cce03 Add UnsafeMemoryDiagnostic & job system improvements
Introduce UnsafeMemoryDiagnostic for thread-local memory leak detection, integrated with AllocationManager. Refactor JobScheduler to support external helper threads and clarify thread index handling. Centralize external job execution logic in JobUtility.TryHelpExecuteJob, updating WorkerThread and Wait accordingly. Improve job state transitions, reference counting, and add [StructLayout] to SPMCQueue/JobInfo. Enhance MemoryLeakException diagnostics, update benchmarks/tests to use JobSchedulerDesc, and bump assembly versions.
2026-05-12 00:44:36 +09:00
cd7eb08f1b fix(collection): fixed deadlock problem in UnsafeChunkedList 2026-05-10 22:15:41 +09:00
f4abe2036a Better handling when size is 0 2026-05-10 21:48:35 +09:00
4b9d93ec65 Refactor TLSFAllocator locking, update AddRange signatures
Refactored TLSFAllocator to use a static lock instead of per-instance GCHandle-based locking, removing the Dispose method and related code. Updated allocation methods to use the static lock for thread safety. Removed Dispose call on s_pTLSFAllocator. In UnsafeChunkedList, removed an unused using directive and replaced explicit int types with var in AddRange. Changed UnsafeList<T>.AddRange to accept ReadOnlySpan<T> for broader compatibility. Bumped assembly version to 1.6.25.
2026-05-10 13:06:32 +09:00
99a7e3c4e1 Add UnsafeChunkedList<T> and tests; refactor alloc/util
Introduced high-performance UnsafeChunkedList<T> with parallel-safe add/read, custom enumerator, and chunk management. Added extensive unit tests for all behaviors and concurrency. Refactored AllocationManager zero-replacement logic, improved MemoryUtility alignment methods, and clarified MemoryBlock/UnsafeArray docs. Simplified Program.cs allocation test and updated build constants. Minor cleanups in GGXMipGenerationBenchmark.
2026-05-08 21:53:07 +09:00
30ab3fbefe Remove InvalidArgumentThrow test 2026-05-07 23:26:35 +09:00
259ff36100 Improve FreeList/TLSF allocators: alignment, GC, decommit
- FreeList: enforce min 16B alignment, use GCHandle for SharedState lifetime, switch to AllocZeroed, and use MemoryUtility for oversized allocs
- Add FreeList.CollectLocal() to flush thread-local caches
- TLSF: add decommitted flag, support front splitting for alignment, add Collect() to decommit large free blocks, use Munmap for cleanup
- Add VirtualMemoryBlock for virtual memory management
- Add tests for CollectLocal (FreeList) and Collect (TLSF)
- Update default allocator config and minor .csproj cleanup
2026-05-07 23:25:04 +09:00
d2c165bbe5 Centralize memory ops via MemoryUtility, add VM support
Refactor all memory allocation/deallocation to use MemoryUtility, replacing direct calls with unified methods. Introduce cross-platform virtual memory management (Mmap, Munmap, Decommit, Recommit). Switch to NativeMemory for standard allocations. Enhance FreeList with global free buckets and thread safety. Standardize alignment/size calculations. Remove global usings for memory utils. Bump version to 1.6.24. Includes minor cleanups and improved docs.
2026-05-07 21:34:25 +09:00
f8b11182a9 Refactor allocation flags, SPMD API, and cleanup
Replaced HasFlag with HasOption for allocation flags to avoid boxing and improve performance. Added AllocationOptionExtensions. Reduced FreeListChunkSize default. Removed redundant allocation handle checks. Renamed MultipleAdd to MultiplyAdd in SPMD interfaces and implementations, updating all usages. Expanded SPMD lane interface with new mask/scatter methods and XML docs. Updated GGX jobs and allocation tests. Bumped assembly versions.
2026-05-07 17:07:10 +09:00
264d96ef96 fix(freelist): fixed the memory leak issue of freelist 2026-05-06 19:30:27 +09:00
d3e497c7d8 Add TLSF allocator and refactor allocation API
- Introduced TLSF allocator with thread-safe wrapper and integrated into AllocationManager.
- Extended AllocationManagerDesc for TLSF config; made properties settable.
- Refactored AllocationHandle to encapsulate function pointers and state, replacing direct field access with methods.
- Updated all memory-related structs to use new AllocationHandle API.
- Added ReplaceIfZeros utility to MemoryUtility.
- Improved IndexOfNullByte performance.
- Minor fix in MemoryLeakException output order.
- FreeList now uses a fixed 64KB refill budget.
- Bumped version to 1.6.21; removed MHP_ENABLE_STACKTRACE from Debug.
- Updated Program.cs to test TLSF allocator and manage allocation lifecycle.
2026-05-05 22:13:58 +09:00
627c1da928 Fixed compilation issue when using mimalloc 2026-05-05 16:38:57 +09:00
c9aa3819a0 Merge branch 'main' of https://git.personalnas.com/Misaki/Misaki.HighPerformance 2026-05-05 16:20:26 +09:00
aed4df9ebf Improve FreeList alignment, error handling, and GGX SPMD
- Increased BlockHeader size, added blockStart, and improved alignment logic in FreeList allocator.
- Changed _MIN_BLOCK_SIZE to 32 and consolidated to a single implementation.
- Updated allocation and free logic for correct pointer alignment and header management.
- MemoryUtility now throws OutOfMemoryException on allocation failure.
- Optimized GGXMipGenerationBenchmark SPMD output with MaskScatter and minor math/cleanup improvements.
- Cleaned up Program.cs and enabled global/test initialization.
- Bumped assembly version to 1.6.19.
2026-05-05 16:19:52 +09:00
53bdf79eee Update README.md 2026-05-04 06:49:26 +00:00
155d7b0fbd SPMD API overhaul: gather/scatter, job & packaging updates
- ISPMDLane: add MaskGather, MaskStore, Scatter, MaskScatter; update MaskLoad/Gather signatures for hardware parity
- WideLane/ScalarLane: implement new methods with HW/fallback logic
- MathV: gather/mask-gather now delegate to lane methods
- Vector2/3/4: add CompressStore, Scatter, MaskScatter
- SPMD jobs/tests/README: migrate to new APIs for correctness
- Use Unsafe.BitCast instead of Unsafe.As/AsRef
- Add SPMDUtility for gather index extraction
- Job system: add ICustomJob<TSelf>, ScheduleCustom overload
- FreeList concurrency obsolete; always thread-safe
- NuGet: include LICENSE/README, set license/readme in .csproj
- Docs: update SPMD usage, clarify safety notes
- Minor: doc fixes, CompressStore test improvements
2026-05-04 13:56:49 +09:00
99fcbec753 Refactor SPMD jobs for true vectorized/masked execution
- Change IJobSPMD.Execute to (indices, mask, ctx) signature for all arities, enabling proper vectorized/masked SPMD execution.
- Update all SPMD job wrappers, extension methods, and test jobs to use new interface.
- Add AVX2 gather/masked gather support to MathV.GatherVector2/3/4 and related methods; use [ConstantExpected] byte scale.
- Improve gather/select logic, pointer arithmetic, and overloads for ref/int* index access.
- Refactor GGXMipGenerationBenchmark and jobs for SPMD, with per-mip-level vectorized jobs and improved memory access.
- Clean up code, fix naming, update comments, and bump version to 1.3.6.
2026-05-03 23:32:04 +09:00
4ffb41e210 add LICENSE 2026-05-03 17:11:08 +09:00
9e9339de3c fixed the issue where a job can't be scheduled correctly when it has an invalid dependency. 2026-05-03 16:25:47 +09:00
fe8362e029 Add custom job scheduling and dependency combiners
- Introduce `CombinedDependenciesJob` for efficient dependency handling and memory management
- Add `ScheduleCustom<T>` for user-defined job execution/free logic
- Refactor `JobInfo` and `JobDataPool<T>` for safer resource management and custom function support
- Improve SPMD extension type constraint formatting
- Update SPMD project content path and increment assembly versions
- Add unit tests for combined dependencies and custom jobs
- Remove `[Timeout]` from tests to prevent spurious failures
- Add TODO for future `WideLane` optimizations
- Replace legacy .sln with .slnx for better solution structure
2026-05-03 15:17:19 +09:00
997aab299c fixed the problem where a long-running job may leak. 2026-05-02 22:18:19 +09:00
f8edb8ce4c add safety check to preferLocal 2026-05-02 22:00:56 +09:00
54d0941e62 fix the locking issue in job scheduler 2026-05-02 21:58:12 +09:00
7bf63c0521 fixed multiple dependencies problem 2026-05-02 21:52:02 +09:00
b801e70f62 add safety check for preferLocal 2026-05-02 21:31:32 +09:00
a048305ebf fixed bug in MarkJobComplete 2026-05-02 20:56:29 +09:00
15d129f29c fixed the issue where multiple dependencies could introduce a deadlock. 2026-05-02 20:31:05 +09:00
edfa6d69f0 add timeout to test 2026-05-02 20:11:18 +09:00
0af4f987df fallback to bounded freelist 2026-05-02 19:36:18 +09:00
6f2bf18eb4 Change Thread.SpinWait(1) to SpinWait 2026-05-02 19:17:47 +09:00
b15e8359cf revert the change to freelist 2026-05-02 19:06:58 +09:00
403690ad49 change freelist back 2026-05-02 18:54:14 +09:00
a593139581 temp fallback to old freelist 2026-05-02 18:40:30 +09:00
6f24a0aefa feat(job): fixed the deadlock issue in job system. 2026-05-02 18:37:30 +09:00
1b22b2a308 add more timeout 2026-05-02 18:14:33 +09:00
254cfa5c43 added timeout 2026-05-02 18:10:08 +09:00
1807559487 add runsettings 2026-05-02 18:03:25 +09:00
b0220a2350 feat: implement multi-threaded JobScheduler with worker threads and dependency management 2026-05-02 17:53:45 +09:00
0265a386ba Refactor thread cache management in allocators
Refactored thread-local stack allocator in AllocationManager to use ThreadLocalStackPool, removing global stack pointer arrays and locks. In FreeList, replaced fixed-size cache array and maxConcurrencyLevel with a dynamic linked-list system using SharedState and CacheReclaimer for thread cache lifecycle management. Block headers now store cache pointers instead of indices. Updated allocation/free logic and tests accordingly. Bumped assembly version to 3.1.3.
2026-05-02 16:47:50 +09:00
d6b4074281 Refactor collections to use 'scoped in T' parameters
Updated Add/Remove/Enqueue/Push/etc. methods in core unsafe collections to accept parameters as 'scoped in T' for improved performance and safety. Bumped assembly versions in both csproj files.
2026-05-02 13:52:45 +09:00
eb01e557d5 Relax SPMD job constraints, improve safety and docs
Removed unmanaged struct requirement from SPMD job wrappers and extension methods, allowing managed types. Updated all wrappers and extension methods to require only the interface constraint. Refactored SPMD test jobs to use safe ref-based Store overloads. Improved README and docs with clearer debug/mimalloc instructions and a better SPMD example. Cleaned up Program.cs by removing obsolete experimental code. Enhanced math precision in GGXMipGenerationBenchmark. Updated T4 template to generate new constraints and APIs.
2026-05-01 12:39:37 +09:00
18a181f57a Add AllBitsSet, refactor WideLane, improve math paths
- Add static AllBitsSet property to ISPMDLane and implement in ScalarLane and WideLane
- Refactor WideLane shuffle table pointers and update usages
- Improve pointer safety and mask handling in CompressStore, Gather, and MaskLoad
- Enhance Sin, Cos, SinCos with fast-math and hardware fallback
- Add Newton-Raphson refinement for reciprocal/sqrt when not fast-math
- Optimize MathV.Vector vector loading (struct init, pointer ops)
- Update project file: version 1.3.4, content packaging, AOT settings
- Minor code cleanup and naming consistency fixes
2026-05-01 12:19:58 +09:00
5b4832a886 Refactor SIMD gather, tighten constraints, doc & test opts
- Require TLane : unmanaged, ISPMDLane for stricter type safety and direct memory ops
- Refactor GatherVectorN and WideLane<T>.Gather to use Unsafe.SkipInit and direct assignment, removing stackalloc and TLane.Load for better SIMD performance
- Use Vector.Sum in WideLane<T>.ReduceAdd
- Add/improve XML docs for ReduceAdd/ReduceMax/ReduceMin
- Update test project for AOT, AVX2, speed optimization, and disable reflection
- Tweak GGXMipGenerationBenchmark and Program.cs for improved benchmarking and output
2026-04-30 16:02:18 +09:00
90461cd0ca Add SPMD lane reductions, gather, and SinCos API changes
- Added MaskLoad, Gather, and reduction methods (ReduceAdd, ReduceMax, ReduceMin) to ISPMDLane<TSelf, TNumber> with XML docs
- Changed SinCos to use out parameters instead of tuple return
- Implemented reductions in ScalarLane and WideLane (loop-based, TODO: SIMD)
- Added GetUnsafePtr to ISPMDLane
- Extended MathV to support Sin, Cos, SinCos, Tan, Asin, Acos, Atan, Atan2 for Vector2/3/4
- Improved WideLane.Sequence to use best vector type
- Updated GGX mip generation for new SinCos signature
- Bumped version to 1.3.2
- Enabled PNG dumping in GGX benchmark
2026-04-29 13:26:02 +09:00
b4535eff00 Refactor GGXMipGenerationJobSPMD for SPMD support
Replaced struct with generic SPMD version for SIMD, added type aliases (commented), optimized RadicalInverse_VdC, and adjusted SampleEquirectangularMap for better performance and code separation.
2026-04-29 01:35:12 +09:00
0acaf00767 Refactor trigonometric funcs, optimize GGX benchmark
- Replaced SIMD-based Sin/Cos/SinCos in WideLane with generic polynomial approximations for hardware independence.
- Updated ScalarLane Cast to use CreateTruncating.
- Applied AggressiveOptimization to key GGX methods; improved luma calculation and radical inverse LUT handling.
- Enhanced GGX benchmark setup, cleanup, and timing logic.
- Bumped project version to 1.3.1.
2026-04-28 22:17:59 +09:00