Author: 紳士 Time: 2025-3-21 22:40
Author: 抒情短詩 Time: 2025-3-22 00:55
Author: FLEET Time: 2025-3-22 06:57
Towards Portable Online Prediction of Network Utilization Using MPI-Level Monitoring
…e applications. A key ingredient to make this possible is an accurate prediction of the future network utilization, enabling the runtime to plan the background operations in advance so as to avoid competing with the application for network bandwidth. In this paper, we propose a portable deep lear…
Author: 整體 Time: 2025-3-22 10:42
Author: 易彎曲 Time: 2025-3-22 15:03
Author: 易彎曲 Time: 2025-3-22 18:41
Combining Checkpointing and Data Compression to Accelerate Adjoint-Based Optimization Problems
… available computers. Data compression is an effective strategy to reduce this memory requirement by a certain factor, particularly if some loss in accuracy is acceptable. A popular alternative is checkpointing, where data is stored at selected points in time, and values at other times are recompute…
Author: monochromatic Time: 2025-3-23 00:30
Linear Time Algorithms for Multiple Cluster Scheduling and Multiple Strip Packing
…atio better than 2 unless .. In this paper, we present an algorithm with approximation ratio 2 and running time . for both problems for . (and running time . for .). While a 2 approximation was known before, the running time of the algorithm is at least . in the worst case. Therefore, an . algorithm…
Author: NAG Time: 2025-3-23 02:42
Scheduling on Two Unbounded Resources with Communication Costs
…we consider a platform with two types of machines, each containing an unbounded number of elements. We want to execute an application represented as a Directed Acyclic Graph (DAG) on this platform. Each task of the application has two possible execution times, depending on the type of machine it is…
Author: 開始發作 Time: 2025-3-23 06:22
Author: 冒煙 Time: 2025-3-23 13:17
Toggle: Contention-Aware Task Scheduler for Concurrent Hierarchical Operations
…s. State-of-the-art approaches for hierarchical locking are unaware of how tasks are scheduled. We propose a lock-contention aware task scheduler which considers the locking request while assigning tasks to threads. We present the design and implementation of Toggle, which exploits nested intervals and w…
Author: Barrister Time: 2025-3-23 15:40
Author: FIS Time: 2025-3-23 21:31
Author: Suppository Time: 2025-3-24 00:35
Author: reflection Time: 2025-3-24 02:53
PLB-HAC: Dynamic Load-Balancing for Heterogeneous Accelerator Clusters
…mputational load among them. Their relative processing speed for each target application is not available in advance and must be computed at runtime. Also, dynamic changes in the environment may cause these processing speeds to change during execution. We propose a Profile-based Load-Balancing algor…
Author: 專橫 Time: 2025-3-24 09:31
Author: anesthesia Time: 2025-3-24 11:35
A Comparison of Random Task Graph Generation Methods for Scheduling Problems
…among a set populated mainly with trivial ones, we rely on properties such as the ., which measures how much a task graph can be decomposed into smaller ones. This property and an in-depth analysis of existing random instance generators establish the sub-exponential generic time complexity of the studied problem.
Author: 博愛家 Time: 2025-3-24 15:01
https://doi.org/10.1007/978-3-322-90289-4
…ork-stealing to maximize throughput. Using the widely used STMBench7 benchmark, a real-world XML hierarchy, and a state-of-the-art hierarchical locking protocol, we illustrate that Toggle considerably improves the overall application throughput.
Author: 圓木可阻礙 Time: 2025-3-24 22:42
Author: 佛刊 Time: 2025-3-24 23:48
Author: 隱士 Time: 2025-3-25 04:25
Author: 使閉塞 Time: 2025-3-25 08:39
Author: nonplus Time: 2025-3-25 15:24
Werner R. Müller, Thomas M. Schwarb
…schedule on one cluster and then distributing it onto the other clusters might come in handy in practical approaches. We demonstrate this by presenting a practical algorithm with running time ., without hidden constants, that is an approximation algorithm with ratio 9/4 if the number . of clusters is divisible by 3 and bounded by . otherwise.
Author: expository Time: 2025-3-25 19:02
Author: 密切關系 Time: 2025-3-25 23:32
Aufzüge mit stetig umlaufendem Zugmittel
…es, we are able to exclude obviously dominated solutions from the design space before scheduling and synthesis. Compared to a standard, multi-criteria optimisation method, we show the benefits of our approach regarding runtime at the design level.
Author: 鬼魂 Time: 2025-3-26 04:01
Author: 堅毅 Time: 2025-3-26 05:02
Accelerating Data-Dependence Profiling with Static Hints
…ffected source-code locations from instrumentation, allowing the profiler to skip them at runtime and avoiding the associated overhead. At the end, we merge static and dynamic dependences. We evaluated our approach with 38 benchmarks from two benchmark suites and obtained a median reduction of the profiling time by 62% across all the benchmarks.
Author: Mawkish Time: 2025-3-26 08:37
Author: Spongy-Bone Time: 2025-3-26 16:24
Hardware Counters’ Space Reduction for Code Region Characterization
…o successfully reduce the number of hardware counters needed to characterize a parallel region, and that this set of counters can be measured at run time with high accuracy and low overhead using counter multiplexing.
Author: 無聊點好 Time: 2025-3-26 20:41
Author: 螢火蟲 Time: 2025-3-26 23:05
Load-Balancing for Parallel Delaunay Triangulations
…sets, we achieve nearly perfectly balanced partitions and small border triangulations. This almost cuts running time in half compared to non-data-sensitive division schemes on inputs exhibiting an exploitable underlying structure.
Author: white-matter Time: 2025-3-27 02:18
Author: 連系 Time: 2025-3-27 05:45
PLB-HAC: Dynamic Load-Balancing for Heterogeneous Accelerator Clusters
…ances. We evaluated the algorithm using data clustering, matrix multiplication, and bioinformatics applications and compared it with existing load-balancing algorithms. PLB-HAC obtained the highest performance gains with more heterogeneous clusters and larger problem sizes, where a more refined load distribution is required.
Author: 聯想記憶 Time: 2025-3-27 09:57
Author: grenade Time: 2025-3-27 15:44
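The PLB-HAC abstract above says that relative processing speeds are measured at runtime and used to distribute the load. As a rough illustration only, the sketch below splits a batch of work items in proportion to measured speeds. This is a plain proportional split, not the PLB-HAC algorithm itself; the function names `measured_speed` and `split_by_speed` and the sample numbers are assumptions made for this example.

```python
from fractions import Fraction
from math import floor


def measured_speed(items_done, seconds):
    """Items processed per second during a profiling run (hypothetical helper)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return items_done / seconds


def split_by_speed(total_items, speeds):
    """Split total_items among devices in proportion to their measured speeds.

    Uses exact rational arithmetic and the largest-remainder rule so that
    the shares are integers and always sum to total_items.
    """
    if total_items < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need a non-negative item count and positive speeds")
    exact_speeds = [Fraction(s) for s in speeds]
    total_speed = sum(exact_speeds)
    exact = [total_items * s / total_speed for s in exact_speeds]
    shares = [floor(x) for x in exact]
    leftover = total_items - sum(shares)
    # Give the remaining items to the devices with the largest fractional parts.
    by_remainder = sorted(range(len(speeds)),
                          key=lambda i: exact[i] - shares[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Two hypothetical devices profiled on a small sample of the workload.
    speeds = [measured_speed(100, 4.0), measured_speed(100, 1.0)]
    print(split_by_speed(1000, speeds))
```

Re-running the split whenever fresh measurements arrive is one simple way to react to the speed changes during execution that the abstract mentions.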
Author: 停止償付 Time: 2025-3-27 20:21
Author: BLUSH Time: 2025-3-27 22:46
Author: deactivate Time: 2025-3-28 05:11
Combining Checkpointing and Data Compression to Accelerate Adjoint-Based Optimization Problems
…ns. In this paper, we combine compression and checkpointing for the first time to compute a realistic seismic inversion. The combination of checkpointing and compression allows larger adjoint computations compared to using only compression, and reduces the recomputation overhead significantly compared to using only checkpointing.
Author: capsule Time: 2025-3-28 07:08
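To make the idea in the checkpointing abstract above concrete, here is a minimal sketch that stores a compressed copy of every k-th state during the forward run and recomputes the states in between during the reverse sweep. Everything in it is an assumption for illustration: `step` is a toy stand-in for a real solver, and lossless `zlib` replaces whatever (possibly lossy) compressor the authors use.

```python
import struct
import zlib


def step(state):
    """Toy forward time step; a stand-in for a real solver (assumption)."""
    return [0.9 * v + 0.1 * i for i, v in enumerate(state)]


def pack(state):
    """Serialize a list of floats as doubles and compress it losslessly."""
    return zlib.compress(struct.pack(f"<{len(state)}d", *state))


def unpack(blob):
    """Inverse of pack()."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def forward_with_checkpoints(x0, n_steps, every):
    """Run n_steps forward, keeping a compressed copy of every `every`-th state."""
    checkpoints = {}
    x = x0
    for t in range(n_steps):
        if t % every == 0:
            checkpoints[t] = pack(x)
        x = step(x)
    return x, checkpoints


def states_in_reverse(checkpoints, n_steps, every):
    """Yield (t, state_t) for t = n_steps-1 down to 0.

    Each segment is rebuilt from its checkpoint, so only one segment of
    uncompressed states is held in memory at a time.
    """
    for start in sorted(checkpoints, reverse=True):
        end = min(start + every, n_steps)
        segment = [unpack(checkpoints[start])]
        for _ in range(end - start - 1):
            segment.append(step(segment[-1]))
        for offset in range(len(segment) - 1, -1, -1):
            yield start + offset, segment[offset]


if __name__ == "__main__":
    final, cps = forward_with_checkpoints([1.0, 2.0, 3.0], n_steps=10, every=4)
    for t, state in states_in_reverse(cps, n_steps=10, every=4):
        print(t, state)  # an adjoint step would consume state_t here
```

A larger `every` stores fewer checkpoints but recomputes more steps per segment, which is the memory versus recomputation trade-off the abstract refers to.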
Scheduling on Two Unbounded Resources with Communication Costs
…on time of the DAG (also called makespan). We show that the problem is NP-complete for graphs of depth at least three but polynomial for graphs of depth at most two. In addition, we provide polynomial-time algorithms for some usual classes of graphs (trees, series-parallel graphs).
Author: myocardium Time: 2025-3-28 13:22
Author: deceive Time: 2025-3-28 16:35
Author: 放大 Time: 2025-3-28 20:44
Author: 步兵 Time: 2025-3-29 00:55
Author: engrossed Time: 2025-3-29 05:52
Author: PANG Time: 2025-3-29 10:40
Author: Noctambulant Time: 2025-3-29 12:33
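For the two-resource scheduling abstract above, the makespan of a fixed allocation can be evaluated with a longest-path pass over the DAG, since each machine type has an unbounded number of elements. The sketch below assumes that a communication cost is paid on an edge only when its two endpoints run on different machine types; that cost model, the data layout, and the names `makespan` and `best_allocation` are assumptions for this example, not taken from the paper. The brute-force search is exponential and only meant for tiny graphs.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import product


def makespan(tasks, edges, alloc):
    """Makespan of a DAG for a fixed allocation of tasks to machine types.

    tasks: {name: (time_on_type_0, time_on_type_1)}
    edges: {(u, v): communication cost, paid only if alloc[u] != alloc[v]}
    alloc: {name: 0 or 1}
    """
    preds = {t: [] for t in tasks}
    for u, v in edges:
        preds[v].append(u)
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        start = 0
        for u in preds[v]:
            delay = edges[(u, v)] if alloc[u] != alloc[v] else 0
            start = max(start, finish[u] + delay)
        finish[v] = start + tasks[v][alloc[v]]
    return max(finish.values(), default=0)


def best_allocation(tasks, edges):
    """Try all 2^n allocations; exponential, for illustration only."""
    names = list(tasks)
    best = None
    for choice in product((0, 1), repeat=len(names)):
        alloc = dict(zip(names, choice))
        value = makespan(tasks, edges, alloc)
        if best is None or value < best[0]:
            best = (value, alloc)
    return best


if __name__ == "__main__":
    tasks = {"a": (2, 1), "b": (3, 5), "c": (1, 4)}
    edges = {("a", "b"): 2, ("a", "c"): 2, ("b", "c"): 1}
    print(best_allocation(tasks, edges))
```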
Lecture Notes in Computer Science
Author: 廢除 Time: 2025-3-29 18:21
https://doi.org/10.1007/978-3-030-29400-7
artificial intelligence; clustering; computer architecture; computer systems; CUDA; data communication sy…
Author: 賞錢 Time: 2025-3-29 20:53
Author: 即席 Time: 2025-3-30 01:24
Author: resuscitation Time: 2025-3-30 08:00
Konstante Reaktionserscheinungen
…ect all execution paths that can be executed concurrently by identifying multi-valued expressions, i.e. expressions evaluated differently among processes. This can be used to find . in parallel programs. In this paper, we propose a new method that combines a control-flow analysis with a multi-valued…
Author: neolith Time: 2025-3-30 09:36
Author: 緊張過度 Time: 2025-3-30 13:48
Author: GLUT Time: 2025-3-30 18:32
https://doi.org/10.1007/978-3-322-86645-5
…are performance counters. Our proposal is aimed towards dynamic tuning and, consequently, the metrics must be collected at execution time, which limits the number of metrics that can be measured. Therefore, our main contribution is the definition of a methodology to determine a reduced set of hardwa…
Author: Flat-Feet Time: 2025-3-30 23:33
Author: 難取悅 Time: 2025-3-31 01:57
Author: stress-response Time: 2025-3-31 06:17