Published: 2026-03-12 16:27:32
Authors: John Southworth
Categories: astro-ph.SR
Abstract:
We present an analysis of BS Dra, a detached eclipsing binary containing two almost-identical F3 V stars in a 3.36-d circular orbit, based on 40 sectors of observations from the Transiting Exoplanet Survey Satellite (TESS) and published spectroscopic results. We measure masses of 1.305 +/- 0.015 Msun and 1.284 +/- 0.017 Msun, and radii of 1.409 +/- 0.006 Rsun and 1.400 +/- 0.006 Rsun, for the two components. The high quality of the TESS data allows -- for the first time -- a definitive identification of the primary eclipse, which is 0.007 mag deeper than the secondary. The primary star is the hotter, larger and more massive of the two: the ratios of the radii and surface brightnesses are both slightly but significantly below unity. We find a distance concordant with the Gaia DR3 parallax and, by comparison to theoretical models, an age of 1600 +/- 300 Myr and a slightly sub-solar chemical composition. Our mean times of primary eclipse, each representing all eclipses in one sector, have a scatter of only 0.37 s around a linear ephemeris: BS Dra may be useful as a celestial clock.
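As a minimal sketch of what a scatter around a linear ephemeris means: eclipse times are modelled as T(E) = T0 + P*E for integer cycle number E, and the scatter is the RMS of the residuals from that fit. The eclipse times below are synthetic, and the reference epoch, cycle numbers, timing offsets, and function names are illustrative assumptions rather than values or methods from the paper; only the approximate 3.36-d period comes from the abstract.

```python
import math

def fit_linear_ephemeris(cycles, times):
    """Ordinary least-squares fit of T(E) = t0 + period * E."""
    n = len(cycles)
    mean_e = sum(cycles) / n
    mean_t = sum(times) / n
    sxx = sum((e - mean_e) ** 2 for e in cycles)
    sxy = sum((e - mean_e) * (t - mean_t) for e, t in zip(cycles, times))
    period = sxy / sxx
    t0 = mean_t - period * mean_e
    return t0, period

def rms_residual_seconds(cycles, times, t0, period):
    """RMS of (observed - predicted) eclipse times, converted from days to seconds."""
    resid = [t - (t0 + period * e) for e, t in zip(cycles, times)]
    return 86400.0 * math.sqrt(sum(r * r for r in resid) / len(resid))

# Synthetic example (assumed values, not the paper's data).
true_t0, true_period = 2000.0, 3.36          # days
cycles = [0, 8, 16, 110, 220, 330]           # hypothetical cycle numbers
offsets_s = [0.3, -0.4, 0.2, -0.1, 0.4, -0.3]  # hypothetical timing noise in seconds
times = [true_t0 + true_period * e + d / 86400.0
         for e, d in zip(cycles, offsets_s)]

t0, period = fit_linear_ephemeris(cycles, times)
scatter_s = rms_residual_seconds(cycles, times, t0, period)
print(f"period = {period:.8f} d, scatter = {scatter_s:.2f} s")
```

An unweighted fit is used here for brevity; a real analysis would normally weight each time by its measurement uncertainty.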
Published: 2026-03-12 16:26:38
Authors: Tae-Eun Song
Categories: cs.CL
Abstract:
Large language models struggle to catch errors in their own outputs when the review happens in the same session that produced them. This paper introduces Cross-Context Review (CCR), a straightforward method where the review is conducted in a fresh session with no access to the production conversation history. We ran a controlled experiment: 30 artifacts (code, technical documents, presentation scripts) with 150 injected errors, tested under four review conditions -- same-session Self-Review (SR), repeated Self-Review (SR2), context-aware Subagent Review (SA), and Cross-Context Review (CCR). Over 360 reviews, CCR reached an F1 of 28.6%, outperforming SR (24.6%, p=0.008, d=0.52), SR2 (21.7%, p<0.001, d=0.72), and SA (23.8%, p=0.004, d=0.57). The SR2 result matters most for interpretation: reviewing twice in the same session did not beat reviewing once (p=0.11), which rules out repetition as an explanation for CCR's advantage. The benefit comes from context separation itself. CCR works with any model, needs no infrastructure, and costs only one extra session.
Published: 2026-03-12 16:24:28
Authors: Yirui Zheng, Juntai Shen, Bin-Hui Chen
Categories: astro-ph.GA
Abstract:
We run a suite of $N$-body simulations to investigate how classical bulges affect bar formation and properties under the internal formation mechanism. We incorporate bulges of varying mass and compactness into disk galaxy models and evolve them in isolation to examine the resulting bar pattern speeds and growth timescales. A more massive/compact bulge increases the Toomre $Q$ stability parameter and the circular velocity in the central region, while decreasing the disk mass fraction. It therefore delays the onset of bar formation and increases the bar growth timescale; sufficiently strong bulges can suppress bar formation entirely. During the formation stage, bars exhibit higher initial pattern speeds and faster deceleration rates when the bulges become more massive or compact. This faster deceleration persists after the bar buckling phase, leading to slower-rotating bars in the secular growth stage. However, when the bulge's "diluting" effect on the measured bar strength is removed or reduced, all bars within the same disk share similar distributions in the pattern speed-bar strength ($Ω_p$-$A_2$) space during the secular growth stage. They also show comparable ratios of the co-rotation radius to the bar length ($\mathcal{R}=R_{\mathrm{CR}}/R_{\mathrm {bar}}$) in this stage. These results suggest that the bulge's influence on the pattern speed is more significant during the bar formation stage, while in the secular growth stage, the bulge's effect may be less important, and the disk component dominates the pattern speed evolution.
Published: 2026-03-12 16:20:35
Authors: Jae-Won Chung, Jeff J. Ma, Jisang Ahn, Yizhuo Liang, Akshay Jajoo, Myungjin Lee, Mosharaf Chowdhury
Categories: cs.LG, cs.DC
Abstract:
Any-to-Any models are an emerging class of multimodal models that accept combinations of multimodal data (e.g., text, image, video, audio) as input and generate them as output. Serving these models is challenging; different requests with different input and output modalities traverse different paths through the model computation graph, and each component of the model has different scaling characteristics.
We present Cornserve, a distributed serving system for generic Any-to-Any models. Cornserve provides a flexible task abstraction for expressing Any-to-Any model computation graphs, enabling component disaggregation and independent scaling. The distributed runtime dispatches compute to the data plane via an efficient record-and-replay execution model that keeps track of data dependencies, and forwards tensor data between components directly from the producer to the consumer. Built on Kubernetes with approximately 23K new lines of Python, Cornserve supports diverse Any-to-Any models and delivers up to 3.81$\times$ higher throughput and 5.79$\times$ lower tail latency. Cornserve is open-source, and the demo video is available on YouTube.
Published: 2026-03-12 16:14:14
Authors: Deyu Zou, Yongqiang Chen, Fan Feng, Mufei Li, Pan Li, Yu Gong, James Cheng
Categories: cs.AI
Abstract:
Reinforcement learning (RL) with outcome-based rewards has achieved significant success in training large language model (LLM) agents for complex reasoning tasks. However, in active reasoning where agents need to strategically ask questions to acquire task-relevant information, we find that LLM agents trained with RL often suffer from information self-locking: the agent ceases to ask informative questions and struggles to internalize already-obtained information. To understand the phenomenon, we decompose active reasoning into two core capabilities: Action Selection (AS), which determines the observation stream through queries, and Belief Tracking (BT), which updates the agent's belief based on collected evidence. We show that deficient AS and BT capabilities will limit the information exploration during RL training. Furthermore, insufficient exploration in turn hinders the improvement of AS and BT, creating a feedback loop that locks the agent in a low-information regime. To resolve the issue, we propose a simple yet effective approach that reallocates the learning signal by injecting easy-to-obtain directional critiques to help the agent escape self-locking. Extensive experiments with 7 datasets show that our approach significantly mitigates the information self-locking, bringing up to 60% improvements.
Published: 2026-03-12 16:01:01
Authors: Dimitri Staufer, Kirsten Morehouse, David Hartmann, Bettina Berendt
Categories: cs.HC, cs.AI, cs.CL, cs.CY
Abstract:
Large language models (LLMs) learn statistical associations from massive training corpora and user interactions, and deployed systems can surface or infer information about individuals. Yet people lack practical ways to inspect what a model associates with their name. We report interim findings from an ongoing study and introduce LMP2, a browser-based self-audit tool. In two user studies ($N_{total}{=}458$), GPT-4o predicts 11 of 50 features for everyday people with $\ge$60\% accuracy, and participants report wanting control over LLM-generated associations despite not considering all outputs privacy violations. To validate our probing method, we evaluate eight LLMs on public figures and non-existent names, observing clear separation between stable name-conditioned associations and model defaults. Our findings also contribute to exposing a broader generative AI evaluation crisis: when outputs are probabilistic, context-dependent, and user-mediated through elicitation, what model--individual associations even include is under-specified and operationalisation relies on crafting probes and metrics that are hard to validate or compare. To move towards reliable, actionable human-centred LLM privacy audits, we identify nine frictions that emerged in our study and offer recommendations for future work and the design of human-centred LLM privacy audits.
Published: 2026-03-12 15:48:52
Authors: Olga Bantysh, Ramon Reigada, Rodrigo C. V. Coelho, Pau Guillamat, Jordi Ignés-Mullol, Francesc Sagués
Categories: cond-mat.soft
Abstract:
The emergence of long-range spatiotemporal order from intrinsic chaos is a central challenge in far-from-equilibrium physics. In active fluids, such as cytoskeletal networks driving cellular motion, self-generated flows typically produce "active turbulence", lacking translational symmetry. Here we show that a chaotic active nematic can self-organize into a spatiotemporal crystal, forming a regular lattice of density, orientation, and vorticity that breaks both spatial and temporal translational symmetry. Using a microtubule/kinesin active nematic interfaced with a lamellar liquid crystal and confined in microfluidic channels, we observe robust spatiotemporal lattices without external forcing. The ordering emerges from spontaneous synchronization of intrinsic flow instabilities, mediated by confinement and feedback between the active layer and the passive anisotropic interface. Continuum nematohydrodynamics simulations support our interpretation, highlighting how intrinsic length and time scales shape the active crystals. These results reconcile chaos and crystallinity in active matter and provide a strategy for engineering order in self-driven, far-from-equilibrium soft materials.
Published: 2026-03-12 15:28:18
Authors: Dimitrios Karamitros, Thomas McKelvey, Snehit Panghal, Apostolos Pilaftsis
Categories: quant-ph, hep-ph
Abstract:
We study in detail the dynamics of unstable two-level quantum systems by adopting the Bloch-vector representation. We identify a novel class of critical scenarios in which the so-called energy-level and decay-width vectors, ${\bf E}$ and ${\bf Γ}$, are orthogonal to one another, and the parameter $r = |{\bf Γ}|/(2|{\bf E}|)$ is less than 1. Most remarkably, we find that critical unstable qubit (CUQ) systems exhibit atypical behaviours like coherence--decoherence oscillations when analysed in an appropriately defined co-decaying frame of the system. By making use of a Fourier series decomposition, we define anharmonicity observables that quantify the degree of non-sinusoidal oscillation of a CUQ. We apply the results of our formalism to the neutral-meson systems and derive generic upper limits on these new observables. In particular, we provide a compilation table of all well-explored meson--antimeson two-level systems in terms of Bloch-sphere parameters.
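The criticality condition stated in the abstract (E and Γ orthogonal, with r = |Γ|/(2|E|) < 1) can be checked numerically as in the sketch below. The vectors are made-up three-component examples, and the function name and tolerance are assumptions for illustration; nothing here reproduces the paper's formalism beyond those two stated conditions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def critical_parameters(e_vec, gamma_vec, tol=1e-12):
    """Return (r, is_critical) for energy-level vector E and decay-width vector Gamma.

    r = |Gamma| / (2 |E|); the system is flagged critical when E and Gamma
    are orthogonal (within a relative tolerance) and r < 1.
    """
    e_norm, g_norm = norm(e_vec), norm(gamma_vec)
    r = g_norm / (2.0 * e_norm)
    orthogonal = abs(dot(e_vec, gamma_vec)) <= tol * e_norm * g_norm
    return r, (orthogonal and r < 1.0)

# Hypothetical example vectors (arbitrary units).
r, is_critical = critical_parameters((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(r, is_critical)

# A non-orthogonal pair fails the criticality test even though r < 1.
r2, is_critical2 = critical_parameters((1.0, 0.0, 0.0), (0.5, 0.5, 0.0))
print(r2, is_critical2)
```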
Published: 2026-03-12 15:22:29
Authors: Helong Huang, Michiel Min, Chris W. Ormel, Achrène Dyrek, Nicolas Crouzet
Categories: astro-ph.EP
Abstract:
Context. WASP-107 b has been observed comprehensively by JWST in the near- and mid-IR bands, making it an ideal planet to probe the composition and internal dynamics. Recent analysis reveals an 8-10 um silicate feature, but it still remains uncertain how silicate clouds form on this planet. Aims. We aim to fit the complete JWST spectrum of WASP-107 b, from 0.9 um to 12 um, with a physically motivated cloud model and self-consistent temperature profile. Methods. Two-stream radiative transfer is coupled to a cloud formation model until convergence between cloud and temperature profiles is reached. We search a model grid spanning metallicity, turbulent diffusivity, internal heat flux and nucleation parameters to find the best-fit model. Results. The silicate cloud feature at 10 um and the near-IR molecular band strength can be simultaneously and naturally explained without assuming a parametrized temperature profile. A moderate vertical diffusivity of Kzz = 10^9 cm^2 s^-1 is needed to bring the cloud particles to the upper atmosphere of WASP-107 b. This Kzz is favored by the joint fitting of the near-IR water feature and mid-IR silicate feature -- both sensitive to clouds. From the strength of H2O and CO2 bands, our model suggests a metallicity 17 times solar. Conclusions. Even in warm planets such as WASP-107 b, silicate clouds can form in the relatively cool upper atmosphere because turbulence uplifts vapor and cloud particles. Despite having considerably fewer degrees of freedom, the self-consistent modeling approach successfully fits WASP-107 b's multi-wavelength data, instilling confidence in the derived physical parameters.
Published: 2026-03-12 15:22:27
Authors: Umberto Cappellazzo, Stavros Petridis, Maja Pantic
Categories: eess.AS, cs.CV, cs.SD
Abstract:
Audio-Visual Speech Recognition (AVSR) leverages both acoustic and visual information for robust recognition under noise. However, how models balance these modalities remains unclear. We present Dr. SHAP-AV, a framework using Shapley values to analyze modality contributions in AVSR. Through experiments on six models across two benchmarks and varying SNR levels, we introduce three analyses: Global SHAP for overall modality balance, Generative SHAP for contribution dynamics during decoding, and Temporal Alignment SHAP for input-output correspondence. Our findings reveal that models shift toward visual reliance under noise yet maintain high audio contributions even under severe degradation. Modality balance evolves during generation, temporal alignment holds under noise, and SNR is the dominant factor driving modality weighting. These findings expose a persistent audio bias, motivating ad-hoc modality-weighting mechanisms and Shapley-based attribution as a standard AVSR diagnostic.
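For intuition about Shapley-based modality attribution, the sketch below computes exact Shapley values for a two-player game in which the players are the audio and visual streams. This is the textbook two-player formula, not the Dr. SHAP-AV procedure: the abstract does not specify the value function or feature granularity, so the scores passed in (a hypothetical recognition quality for each subset of modalities) and the function name are assumptions.

```python
def two_modality_shapley(v_none, v_audio, v_video, v_both):
    """Exact Shapley values for two players (audio, video).

    Each argument is the value (e.g. a recognition score) obtained with that
    subset of modalities present. With two players there are only two
    orderings, so each Shapley value is the mean of two marginal gains.
    """
    phi_audio = 0.5 * ((v_audio - v_none) + (v_both - v_video))
    phi_video = 0.5 * ((v_video - v_none) + (v_both - v_audio))
    return phi_audio, phi_video

# Hypothetical scores for one noisy utterance (assumed numbers).
phi_a, phi_v = two_modality_shapley(v_none=0.0, v_audio=0.6,
                                    v_video=0.3, v_both=0.8)
audio_share = phi_a / (phi_a + phi_v)
print(f"audio: {phi_a:.3f}, video: {phi_v:.3f}, audio share: {audio_share:.1%}")
```

The two values always sum to `v_both - v_none` (the efficiency property), which is what makes a normalized audio/visual share well defined.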
Published: 2026-03-12 15:19:45
Authors: Krishna Kant Singh, Eric Müller, Eleni Mathioulaki, Wouter Klijn, Lena Oden
Categories: cs.DC
Abstract:
Deploying complex, distributed scientific workflows across diverse HPC sites is often hindered by site-specific dependencies and complex build environments. This paper investigates the design and performance of portable HPC container images capable of encapsulating MPI- and CUDA-enabled software stacks without sacrificing bare-metal performance. It is part of recent work within the EBRAINS Research Infrastructure to evaluate the implementation of portable HPC (Apptainer-based) container images targeting the EBRAINS Software Distribution (ESD) -- a Spack-based software ecosystem comprising approximately 80 top-level packages (and 800 dependencies). We evaluate a hybrid, PMIx-based containerization strategy using Apptainer that seamlessly bypasses the need for site-specific builds by dynamically leveraging host-level specialized hardware, such as network interfaces and GPUs, on two production HPC clusters: Karolina and Jureca-DC. We demonstrate the feasibility of building portable, MPI- and CUDA-enabled scientific software into container images that correctly leverage site-installed drivers and hardware to reproduce bare-metal communication behavior. Using communication microbenchmarks (e.g., OSU and NCCL) alongside performance metrics of applications from neuroscience, we measure and verify their performance against bare-metal deployments. Crucially, our verification approach extends beyond top-level runtime measurements; we highlight the analysis of underlying debug logs to actively detect misbehavior and misconfigurations, such as suboptimal transport pathways. Ultimately, this investigation demonstrates the feasibility of a simple and reproducible methodology for decoupling software environments from underlying infrastructures, paving the way for automated pipelines that ensure optimized, performance-verified execution across varied HPC architectures.
Published: 2026-03-12 15:14:35
Authors: Valentyn Melnychuk, Vahid Balazadeh, Stefan Feuerriegel, Rahul G. Krishnan
Categories: cs.LG
Abstract:
Foundation models based on prior-data fitted networks (PFNs) have shown strong empirical performance in causal inference by framing the task as an in-context learning problem. However, it is unclear whether PFN-based causal estimators provide uncertainty quantification that is consistent with classical frequentist estimators. In this work, we address this gap by analyzing the frequentist consistency of PFN-based estimators for the average treatment effect (ATE). (1) We show that existing PFNs, when interpreted as Bayesian ATE estimators, can exhibit prior-induced confounding bias: the prior is not asymptotically overwritten by data, which, in turn, prevents frequentist consistency. (2) As a remedy, we suggest employing a calibration procedure based on a one-step posterior correction (OSPC). We show that the OSPC helps to restore frequentist consistency and can yield a semi-parametric Bernstein-von Mises theorem for calibrated PFNs (i.e., both the calibrated PFN-based estimators and the classical semi-parametric efficient estimators converge in distribution with growing data size). (3) Finally, we implement OSPC through tailoring martingale posteriors on top of the PFNs. In this way, we are able to recover functional nuisance posteriors from PFNs, required by the OSPC. In multiple (semi-)synthetic experiments, PFNs calibrated with our martingale posterior OSPC produce ATE uncertainty that (i) asymptotically matches frequentist uncertainty and (ii) is well calibrated in finite samples in comparison to other Bayesian ATE estimators.
Published: 2026-03-12 15:11:23
Authors: Gauhar Abbas
Categories: hep-ph, hep-ex
Abstract:
We discuss how conventional Technicolour dynamics can be revitalized within the Dark Technicolour paradigm by invoking the Extended Most Attractive Channel hypothesis. In this framework, Standard Model fermions acquire masses via multifermion chiral condensates arising from new strong dynamics. The model incorporates three confining gauge sectors, Technicolour, Dark Technicolour, and an intermediate QCD-like sector, linked through extended gauge symmetries. The Extended Most Attractive Channel hypothesis reveals a hierarchical structure of condensates, where channels with higher net chirality become increasingly attractive. At low energies, the Dark Technicolour paradigm naturally reduces to the Froggatt-Nielsen or Standard Hierarchical Vacuum Expectation Value model, governed by residual discrete symmetries, offering a compelling resolution to the Standard Model Flavor Problem.
Published: 2026-03-12 15:10:53
Authors: Song-Tao Liu, Tian-Yang Sun, Yu-Xin Wang, Yong-Xin Zhang, Shang-Jie Jin, Jing-Fei Zhang, Xin Zhang
Categories: gr-qc, astro-ph.CO, astro-ph.IM, hep-ph
Abstract:
Gravitational waves (GW) emitted by binary systems allow us to perform precision tests of general relativity in the strong field regime. Ringdown signals allow for probing black hole mass and spin with high precision in GW astronomy. With improvements in current and next-generation GW detectors, developing likelihood-free parameter inference methods is crucial. This is especially important when facing challenges such as non-standard noise, partial data, or incomplete signal models that prevent the use of analytical likelihood functions. In this work, we propose an amortized simulation-based inference strategy to estimate ringdown parameters directly. Specifically, our method is based on amortized neural posterior estimation, which trains a neural density estimator of the posterior for all data segments within the prior range. The results show that our trained amortized network achieves statistically consistent parameter estimates with valid confidence coverage compared to established Markov-chain methods, while offering inference speeds that are orders of magnitude faster. Furthermore, we evaluate the robustness of the method against transient noise contamination. Our analysis reveals that the timing of glitch injection has a decisive impact on estimation bias, particularly during the tail of a signal with sparse information. Glitch strength is positively correlated with estimation error, but has limited effect at low signal-to-noise ratios. Mass and spin parameters are most sensitive to noise. This study not only provides an efficient and accurate inference framework for ringdown analysis but also lays a foundation for developing robust data-processing pipelines for future GW astronomy in realistic noise environments.
Published: 2026-03-12 15:07:15
Authors: Hung Nguyen-Kha, Ti Ti Nguyen, Vu Nguyen Ha, Eva Lagunas, Symeon Chatzinotas, Bjorn Ottersten
Categories: eess.SP
Abstract:
In Earth observation (EO) missions with Low Earth orbit (LEO) satellites, high-resolution image acquisition generates a massive data volume that poses a significant challenge for transmission under the limited satellite power budget, while LEO movement introduces dynamic systems. To enable efficient image transmission, this paper employs semantic communication (SemCom) with joint source-channel coding (JSCC), which focuses on transmitting meaningful information to reduce power consumption. Under a quality-of-service (QoS) requirement defined by image reconstruction quality, this work aims to minimize the total transmit power by jointly optimizing the JSCC encoder-decoder parameters and resource allocation. However, the implicit relationship among JSCC parameters, link quality, and image quality, coupled with the presence of mixed integer-continuous variables, makes the problem difficult to solve directly. To address this, a curve-fitting model is proposed to approximate the JSCC compression-SNR-quality relationship. Then, the joint compression ratio-resource allocation (JCRRA) algorithm is proposed to address the underlying problem. Numerical results demonstrate that the proposed method achieves substantial power savings compared to both greedy algorithms and conventional transmission paradigms.
Published: 2026-03-12 15:05:03
Authors: Haotong Duan, Zhongming Chen, Ngai Wong
Categories: cs.LG
Abstract:
Tensor networks, which were originally developed for characterizing complex quantum many-body systems, have recently emerged as a powerful framework for capturing high-dimensional probability distributions with strong physical interpretability. This paper systematically studies matrix product states (MPS) for generative modeling and shows that unitary MPS, which is a tensor-network architecture that is both simple and expressive, offers clear benefits for unsupervised learning by reducing ambiguity in parameter updates and improving efficiency. To overcome the inefficiency of standard gradient-based MPS training, we develop a Riemannian optimization approach that casts probabilistic modeling as an optimization problem with manifold constraints, and further derive an efficient space-decoupling algorithm. Experiments on Bars-and-Stripes and EMNIST datasets demonstrate fast adaptation to data structure, stable updates, and strong performance while maintaining the efficiency and expressive power of MPS.
Published: 2026-03-12 15:04:53
Authors: S. Brendle
Categories: math.DG
Abstract:
In this expository paper, we discuss a unified framework for proving various geometric inequalities, based on the so-called Alexandrov-Bakelman-Pucci technique. Examples include Cabré's proof of the classical isoperimetric inequality in Euclidean space; the Fenchel-Willmore-Chen inequality for the mean curvature of a submanifold; the sharp version of the Michael-Simon Sobolev inequality for submanifolds; the sharp version of Ecker's logarithmic Sobolev inequality for submanifolds; and the Sobolev inequality for complete manifolds with nonnegative Ricci curvature and Euclidean volume growth. Finally, we discuss a connection to the work of Heintze and Karcher on the volume of a tubular neighborhood of a hypersurface in a manifold with nonnegative Ricci curvature.
Published: 2026-03-12 14:56:04
Authors: Haimiti Atila, Seymour M. J. Spence
Categories: cs.LG
Abstract:
Modeling high-dimensional, nonlinear dynamic structural systems under natural hazards presents formidable computational challenges, especially when simultaneously accounting for uncertainties in external loads and structural parameters. Studies have successfully incorporated uncertainties related to external loads from natural hazards, but few have simultaneously addressed loading and parameter uncertainties within structural systems while accounting for prediction uncertainty of neural networks. To address these gaps, three metamodeling frameworks were formulated, each coupling a feature-extraction module implemented through a multi-layer perceptron (MLP), a message-passing neural network (MPNN), or an autoencoder (AE) with a long short-term memory (LSTM) network using Monte Carlo dropout and a negative log-likelihood loss. The resulting architectures (MLP-LSTM, MPNN-LSTM, and AE-LSTM) were validated on two case studies: a multi-degree-of-freedom Bouc-Wen system and a 37-story fiber-discretized nonlinear steel moment-resisting frame, both subjected to stochastic seismic excitation and structural parameter uncertainty. All three approaches achieved low prediction errors: the MLP-LSTM yielded the most accurate results for the lower-dimensional Bouc-Wen system, whereas the MPNN-LSTM and AE-LSTM provided superior performance on the more complex steel-frame model. Moreover, a consistent correlation between predictive variance and actual error confirms the suitability of these frameworks for active-learning strategies and for assessing model confidence in structural response predictions.
Published: 2026-03-12 14:53:29
Authors: Cervane Grimaud, Denys Malyshev, Emmanuel Moulin
Categories: astro-ph.HE, astro-ph.CO, hep-ph
Abstract:
Axion-Like-Particles (ALPs) are hypothetical pseudo-scalar particles actively searched for as light dark matter candidates. The coupling of ALPs to photons can give rise to distinctive spectral features in the observed gamma-ray spectrum of astrophysical sources.
We perform a forecast study on the sensitivity to ALP-photon interactions using stacked mock observations of selected active galactic nuclei (AGNs) located behind galaxy clusters (GC). The ALP-photon conversion in the magnetic fields of galaxy clusters gives rise to absorption-like features in AGN spectra that are subject to large variance in their prediction for individual sources. We consider here a stacking analysis of multiple AGN-cluster pairs, which yields a more controlled prediction of the expected ALP-induced spectral patterns in the observed gamma-ray spectra.
Using realistic mock observations of selected Fermi-LAT AGNs by ongoing Imaging Atmospheric Cherenkov Telescopes such as H.E.S.S., MAGIC and VERITAS, we provide a careful assessment of the expected sensitivity of a combined statistical analysis of many AGN-GC pairs, together with the impact of modelling and instrumental uncertainties. The sensitivity reaches ALP-photon couplings down to 6$\times$10$^{-13}$ GeV$^{-1}$ for an ALP mass of 3$\times$10$^{-8}$ eV, and is currently statistically dominated, indicating further improvements from more observations. Such a stacking analysis approach enables exploration of the yet-uncharted ALP dark matter parameter space in the 10$^{-8}$ - 10$^{-7}$ eV mass range.
Published: 2026-03-12 14:49:54
Authors: Hakob Avetisyan, Vahagn Abgaryan
Categories: quant-ph
Abstract:
We analyze propagation and detection of two-photon states expanded in Zernike modes through atmospheric turbulence using the extended Huygens-Fresnel formalism. For SPDC states prepared with a single Zernike pump mode, we analytically reduce the 8-dimensional continuous propagation integrals to an exact, discrete modal expansion. In the absence of turbulence, Zernike addition enforces conservation of azimuthal index and a strict radial-order bound. Turbulence relaxes these constraints, driving structured azimuthal and radial crosstalk dominated by low-order aberration modes. By explicitly removing the lowest-order terms from the discrete turbulence sum, we demonstrate that partial adaptive optics correcting only up to the sixth radial order is sufficient to heavily suppress this crosstalk and restore near-ideal spatial correlations.
Published: 2026-03-12 14:49:40
Authors: Eftychia Madika, Bia Boccardi, Luca Ricci, Paola Grandi, Eleonora Torresi, Gabriele Giovannini, Matthias Kadler, J. Anton Zensus
Categories: astro-ph.HE, astro-ph.GA
Abstract:
We present a comprehensive multifrequency VLBI analysis of the FRII, high-excitation radio galaxy 3C 452, aiming to resolve and analyze for the first time its twin-jet structure on sub-parsec scales. Our data set comprises High Sensitivity Array (HSA) observations at 4.9, 8.4, 15.4, 23.6, and 43.2 GHz. Through fitting methods performed in both the visibility and the image plane, we trace the jet expansion from scales of a few thousand to nearly $10^5$ Schwarzschild radii ($R_S$) on both the approaching and receding jets. Additionally, we derive the core brightness temperatures and Doppler factors to constrain the jet's orientation and intrinsic speed. Our study provides the first detailed description of the twin-jet system in 3C 452 on VLBI scales, confirming it as a rare FRII source with jets detected down to millimeter wavelengths. We resolve both jet and counter-jet down to scales of a few thousand $R_S$, revealing a symmetric, parabolically expanding structure with power-law indices $k \approx 0.66$ (jet) and $k \approx 0.47$ (counter-jet). The brightness temperature analysis yields low Doppler factors ($δ\sim 0.03$-$0.83$), indicative of Doppler de-boosting due to the large viewing angle ($θ\approx 70^\circ$) and/or a magnetically dominated jet base. A spectral index analysis reveals a strongly inverted core spectrum ($α> 2$) with additional absorption at the highest frequencies, followed by a sharp steepening ($α\sim -2.5$) to optically thin values in the innermost jet. Finally, a comparison between broad- and narrow-line high-excitation radio galaxies shows that jets in narrow-line sources such as 3C 452 and Cygnus A complete collimation at $\leq 10^5 R_S$, whereas broad-line sources exhibit shape transitions at $10^6$-$10^7 R_S$, suggesting that orientation plays an important role in the observed collimation scales.
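A jet expansion profile of the kind quoted above is commonly described by a power law, width ∝ distance^k, where k = 1 corresponds to a conical jet and k < 1 to a more collimated, parabolic-like shape. The sketch below recovers k from a straight-line fit in log-log space. The distances and widths are synthetic, generated with k = 0.66 purely for illustration, and the function name is an assumption; the paper's own fits are performed on visibility- and image-plane measurements that are not reproduced here.

```python
import math

def fit_power_law_index(distances, widths):
    """Least-squares fit of log(width) = log(amplitude) + k * log(distance)."""
    xs = [math.log(z) for z in distances]
    ys = [math.log(w) for w in widths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    amplitude = math.exp(mean_y - k * mean_x)
    return k, amplitude

# Synthetic profile in units of Schwarzschild radii (assumed values).
distances = [1e3, 3e3, 1e4, 3e4, 1e5]
widths = [2.0 * z ** 0.66 for z in distances]

k, amplitude = fit_power_law_index(distances, widths)
print(f"k = {k:.3f}, amplitude = {amplitude:.3f}")
```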
Published: 2026-03-12 14:47:16
Authors: Shi Jin, Shuyi Zhang
Categories: quant-ph
Abstract:
This paper investigates quantum simulation algorithms for the Liouville equation in geometrical optics with partial transmission and reflection at sharp interfaces, based on the Schrödingerization method. By means of a warped phase transformation in one higher dimension, the Schrödingerization method converts linear partial differential equations into a system of Schrödinger-type equations with unitary evolution, thereby rendering them suitable for quantum simulation. In this work, the Schrödingerization method is combined with a Hamiltonian-preserving scheme that incorporates partial transmission and reflection into the numerical flux. A main difficulty is that the interface treatment in the classical scheme relies on threshold-dependent "if/else" procedures, making it highly nontrivial to reformulate the method in a matrix form suitable for quantum simulation. To overcome this difficulty, we encode the interface conditions into a partial transmission and reflection matrix prepared a priori, rather than during the time evolution. We present detailed constructions of the resulting quantum algorithms and show through complexity analysis that the proposed methods achieve polynomial quantum advantage in the precision parameter $ε$ over their classical counterparts.
Published: 2026-03-12 14:38:13
Authors: Qianpu Sun, Xiaowei Chi, Yuhan Rui, Ying Li, Kuangzhi Ge, Jiajun Li, Sirui Han, Shanghang Zhang
Categories: cs.AI
Abstract:
Artificial intelligence is increasingly catalyzing scientific automation, with multimodal large language model (MLLM) agents evolving from lab assistants into self-driving lab operators. This transition imposes stringent safety requirements on laboratory environments, where fragile glassware, hazardous substances, and high-precision laboratory equipment render planning errors or misinterpreted risks potentially irreversible. However, the safety awareness and decision-making reliability of embodied agents in such high-stakes settings remain insufficiently defined and evaluated. To bridge this gap, we introduce LABSHIELD, a realistic multi-view benchmark designed to assess MLLMs in hazard identification and safety-critical reasoning. Grounded in U.S. Occupational Safety and Health Administration (OSHA) standards and the Globally Harmonized System (GHS), LABSHIELD establishes a rigorous safety taxonomy spanning 164 operational tasks with diverse manipulation complexities and risk profiles. We evaluate 20 proprietary models, 9 open-source models, and 3 embodied models under a dual-track evaluation framework. Our results reveal a systematic gap between general-domain MCQ accuracy and Semi-open QA safety performance, with models exhibiting an average drop of 32.0% in professional laboratory scenarios, particularly in hazard interpretation and safety-aware planning. These findings underscore the urgent necessity for safety-centric reasoning frameworks to ensure reliable autonomous scientific experimentation in embodied laboratory contexts. The full dataset will be released soon.
Published: 2026-03-12 14:25:44
Authors: Jiayue Pu, Zhongxiang Sun, Zilu Zhang, Xiao Zhang, Jun Xu
Categories: cs.CV, cs.AI, cs.CR
Abstract:
The rapid evolution of embodied agents has accelerated the deployment of household robots in real-world environments. However, unlike structured industrial settings, household spaces introduce unpredictable safety risks, where system limitations such as perception latency and lack of common sense knowledge can lead to dangerous errors. Current safety evaluations, often restricted to static images, text, or general hazards, fail to adequately benchmark dynamic unsafe action detection in these specific contexts. To bridge this gap, we introduce HomeSafe-Bench, a challenging benchmark designed to evaluate Vision-Language Models (VLMs) on unsafe action detection in household scenarios. HomeSafe-Bench is constructed via a hybrid pipeline combining physical simulation with advanced video generation and features 438 diverse cases across six functional areas with fine-grained multidimensional annotations. Beyond benchmarking, we propose Hierarchical Dual-Brain Guard for Household Safety (HD-Guard), a hierarchical streaming architecture for real-time safety monitoring. HD-Guard coordinates a lightweight FastBrain for continuous high-frequency screening with an asynchronous large-scale SlowBrain for deep multimodal reasoning, effectively balancing inference efficiency with detection accuracy. Evaluations demonstrate that HD-Guard achieves a superior trade-off between latency and performance, while our analysis identifies critical bottlenecks in current VLM-based safety detection.
Published: 2026-03-12 14:20:29
Authors: Junhyeong Byeon, Jeongyeol Kim, Sejoon Lim
Categories: cs.CV, cs.AI
Abstract:
Emotion recognition in in-the-wild video data remains a challenging problem due to large variations in facial appearance, head pose, illumination, background noise, and the inherently dynamic nature of human affect. Relying on a single modality, such as facial expressions or speech, is often insufficient to capture these complex emotional cues. To address this issue, we propose a multimodal emotion recognition framework for the Expression (EXPR) Recognition task in the 10th Affective Behavior Analysis in-the-wild (ABAW) Challenge.
Our approach leverages large-scale pre-trained models, namely CLIP for visual encoding and Wav2Vec 2.0 for audio representation learning, as frozen backbone networks. To model temporal dependencies in facial expression sequences, we employ a Temporal Convolutional Network (TCN) over fixed-length video windows. In addition, we introduce a bi-directional cross-attention fusion module, in which visual and audio features interact symmetrically to enhance cross-modal contextualization and capture complementary emotional information. A lightweight classification head is then used for final emotion prediction. We further incorporate a text-guided contrastive objective based on CLIP text features to encourage semantically aligned visual representations.
Experimental results on the ABAW 10th EXPR benchmark show that the proposed framework provides a strong multimodal baseline and achieves improved performance over unimodal modeling. These results demonstrate the effectiveness of combining temporal visual modeling, audio representation learning, and cross-modal fusion for robust emotion recognition in unconstrained real-world environments.
Published: 2026-03-12 14:04:58
Authors: Pranav Raikote, Korbinian Randl, Ioanna Miliou, Athanasios Lakes, Panagiotis Papapetrou
Categories: cs.CL
Abstract:
Scaling educational assessment with large language models requires not just accuracy, but the ability to recognize when predictions are trustworthy. Instruction-tuned models tend to be overconfident, and their reliability deteriorates as curricula evolve, making fully autonomous deployment unsafe in high-stakes settings. We introduce CHiL(L)Grader, the first automated grading framework that incorporates calibrated confidence estimation into a human-in-the-loop workflow. Using post-hoc temperature scaling, confidence-based selective prediction, and continual learning, CHiL(L)Grader automates only high-confidence predictions while routing uncertain cases to human graders, and adapts to evolving rubrics and unseen questions. Across three short-answer grading datasets, CHiL(L)Grader automatically scores 35-65% of responses at expert-level quality (QWK >= 0.80). A QWK gap of 0.347 between accepted and rejected predictions confirms the effectiveness of the confidence-based routing. Each correction cycle strengthens the model's grading capability as it learns from teacher feedback. These results show that uncertainty quantification is key for reliable AI-assisted grading.
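The routing idea in the abstract above (temperature-scaled confidence, then automating only high-confidence predictions) can be illustrated with a minimal sketch. This is not the CHiL(L)Grader implementation: the function names `softmax` and `route`, the temperature value, and the threshold are all hypothetical, and in practice the temperature would be fitted post hoc on held-out data rather than passed in by hand.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; temperature > 1 softens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def route(logits, temperature, threshold):
    """Return (label, confidence, decision) for one graded response.

    decision is "auto" when the calibrated confidence reaches the
    threshold and "human" otherwise. Both parameters are illustrative.
    """
    probs = softmax(logits, temperature)
    confidence = max(probs)
    label = probs.index(confidence)
    decision = "auto" if confidence >= threshold else "human"
    return label, confidence, decision


# The same logits are accepted when uncalibrated but routed to a human
# grader once a (hypothetical) fitted temperature of 4.0 is applied.
print(route([4.0, 0.0, 0.0], temperature=1.0, threshold=0.8))
print(route([4.0, 0.0, 0.0], temperature=4.0, threshold=0.8))
```

The second call shows why calibration matters for selective prediction: an overconfident score that would have been automated falls below the threshold after scaling.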
Published: 2026-03-12 13:56:42
Authors: Hao Yang, Minghan Wang, Tongtong Wu, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
Categories: cs.SD, cs.CL, cs.MM, eess.AS
Abstract:
Large Audio Language Models (LALMs) have expanded interaction with humans to the speech modality, which offers great interactive potential because paralinguistic cues implicitly indicate the user context. However, building on the current content-centred paradigm, LALMs usually neglect such paralinguistic cues and respond solely based on query content. In this work, to resurface paralinguistic awareness in LALMs, we introduce five diverse layer-wise analyses to jointly identify paralinguistic layers and semantic understanding layers. Based on these insights, we propose a paralinguistic-enhanced fine-tuning (PE-FT) protocol to equip LALMs with paralinguistic-aware capabilities, comprising (1) selective-layer fine-tuning and (2) an auxiliary dual-level classification head. Our experiments demonstrate that the PE-FT protocol efficiently and effectively resurfaces paralinguistic awareness, even surpassing the performance of the all-layer fine-tuning strategy.
Published: 2026-03-12 13:53:20
Authors: Ihor Kendiukhov
Categories: cs.LG
Abstract:
Mechanistic interpretability of biological foundation models has relied on selective feature sampling, pairwise interaction testing, and observational trajectory analysis. Each of these can introduce systematic bias. Here we present three experiments that address these limitations through exhaustive circuit tracing, higher order combinatorial ablation, and causal trajectory steering in Geneformer, a transformer based single cell foundation model. First, exhaustive tracing of all 4065 active sparse autoencoder features at layer 5 yields 1393850 significant downstream edges, a 27 fold expansion over selective sampling. This reveals a heavy tailed hub distribution in which 1.8 percent of features account for disproportionate connectivity and 40 percent of the top 20 hubs lack biological annotation. These results indicate systematic annotation bias in prior selective analyses. Second, three way combinatorial ablation across 8 feature triplets shows that redundancy deepens monotonically with interaction order, with a three way ratio of 0.59 versus a pairwise ratio of 0.74, and with zero synergy. This confirms that the model architecture is subadditive at all tested orders. Third, trajectory guided feature steering establishes a causal link between layer position and differentiation directionality. Late layer features at L17 consistently push cell states toward maturity, with fraction positive equal to 1.0. Early and mid layer features at L0 and L11 mostly push away from maturity, with fraction positive ranging from 0.00 to 0.58. Together these results move from correlation toward causal evidence for layer dependent control of cell state.
Published: 2026-03-12 13:52:44
Authors: Kanishka Gunawardana, Sanka Peeris, Kavishka Rambukwella, Thamish Wanduragala, Saadia Jameel, Roshan Ragel, Isuru Nawinne
Categories: cs.AR, cs.NE
Abstract:
Spiking Neural Networks (SNNs) have gained significant attention in edge computing due to their low power consumption and computational efficiency. However, existing implementations either use conventional System on Chip (SoC) architectures that suffer from memory-processor bottlenecks, or large-scale neuromorphic hardware that is inefficient and wasteful for small-scale SNN applications. This work presents SNAP-V, a RISC-V-based neuromorphic SoC with two accelerator variants: Cerebra-S (bus-based) and Cerebra-H (Network-on-Chip (NoC)-based) which are optimized for small-scale SNN inference, integrating a RISC-V core for management tasks, with both accelerators featuring parallel processing nodes and distributed memory. Experimental results show close agreement between software and hardware inference, with an average accuracy deviation of 2.62% across multiple network configurations, and an average synaptic energy of 1.05 pJ per synaptic operation (SOP) in 45 nm CMOS technology. These results show that the proposed solution enables accurate, energy-efficient SNN inference suitable for real-time edge applications.
Published: 2026-03-12 13:48:29
Authors: Uttamasha Anjally Oyshi, Susan Gauch
Categories: cs.AI
Abstract:
Despite frequent double-blind review, demographic biases of authors still disadvantage the underrepresented groups. We present Fair-PaperRec, a MultiLayer Perceptron (MLP)-based model that addresses demographic disparities in post-review paper acceptance decisions while maintaining high-quality requirements. Our methodology penalizes demographic disparities while preserving quality through intersectional criteria (e.g., race, country) and a customized fairness loss, in contrast to heuristic approaches. Evaluations using conference data from ACM Special Interest Group on Computer-Human Interaction (SIGCHI), Designing Interactive Systems (DIS), and Intelligent User Interfaces (IUI) indicate a 42.03% increase in underrepresented group participation and a 3.16% improvement in overall utility, indicating that diversity promotion does not compromise academic rigor and supports equity-focused peer review solutions.
Published: 2026-03-12 13:44:50
Authors: Leo Stenzel, Roeland ter Hoeven, Ryoji Miyazaki, Tomohiro Yamaji, Masayuki Shirane, Wolfgang Lechner
Categories: quant-ph
Abstract:
Coherent states offer a promising path for near-term quantum computing due to their inherent protection against bit-flip noise. However, their large photon numbers can be challenging for numerical simulation. This paper introduces an effective model, representing coherent-state quantum annealing using spin-1/2 degrees of freedom. We demonstrate that this model yields accurate predictions for realistic experimental settings and can therefore serve as a practical tool for optimizing future quantum hardware.
Published: 2026-03-12 13:43:30
Authors: B. O. Kerbikov
Categories: hep-ph, astro-ph.HE
Abstract:
Neutron to mirror-neutron transitions in neutron stars could result in significant effects. In this work we show that collisional decoherence entails exponential relaxation in lieu of oscillations. Decoherence is many orders of magnitude faster than the expected oscillations. The admixture of mirror neutrons remains very small at all times relative to the ordinary neutron component.
Published: 2026-03-12 13:38:48
Authors: Bo Chen
Categories: math.DG
Abstract:
This paper establishes decay estimates near isolated singularities for $n$-dimensional Yang-Mills-Higgs fields defined on a fiber bundle ($n \geq 4$). These estimates yield a removable singularity theorem for Yang-Mills-Higgs fields under conformally invariant energy bounds, extending the classical results for Yang-Mills fields and harmonic maps.
Published: 2026-03-12 13:31:53
Authors: Xinyang Li, Songjie Yang, Boyu Ning, Zongmiao He, Xiang Ling, Chau Yuen
Categories: eess.SP
Abstract:
Hybrid beamforming for extremely large-scale multiple-input multiple-output (XL-MIMO) systems is challenging in the near field because the channel depends jointly on angle and distance, and the multiuser interference (MUI) is strong. Existing deep learning methods typically follow either a decoupled design that optimizes analog beamforming without explicitly accounting for MUI, or an end-to-end (E2E) joint analog-digital optimization that can be unstable under nonconvex constant-modulus (CM), pronounced analog-digital coupling, and gradient pattern of sum-rate loss. To address both issues, we develop a complex-valued E2E framework based on a variant minimum mean square error (variant-MMSE) criterion, where the digital precoder is eliminated in closed form via Karush-Kuhn-Tucker (KKT) conditions so that analog learning is trained with a stable objective. The network employs a grouped complex-convolution sensing front-end for uplink (UL) measurements, a shared complex multi-layer perceptron (MLP) for per-user feature extraction, and a merged constant-modulus head to output the analog precoder. In the indirect mode, the network designs hybrid beamformers from estimated channel state information (CSI). In the direct mode where explicit CSI is unavailable, the network learns the sensing operator and the analog mapping from short pilots, after which additional pilots estimate the equivalent channel and enable a KKT closed-form digital precoder. Simulations show that the indirect mode approaches the performance of iterative variant-MMSE optimization with a complexity reduction proportional to the antenna number. In the direct mode, the proposed method improves spectral efficiency over sparse-recovery pipelines and recent deep learning baselines under the same pilot budget.
Published: 2026-03-12 13:31:43
Authors: Pietro Bonazzi, Nicola Farronato, Stefan Zihlmann, Haotong Qin, Michele Magno
Categories: cs.CV
Abstract:
Real-time, on-device segmentation is critical for latency-sensitive and privacy-aware applications such as smart glasses and Internet-of-Things devices. We introduce PicoSAM3, a lightweight promptable visual segmentation model optimized for edge and in-sensor execution, including deployment on the Sony IMX500 vision sensor. PicoSAM3 has 1.3 M parameters and combines a dense CNN architecture with region of interest prompt encoding, Efficient Channel Attention, and knowledge distillation from SAM2 and SAM3. On COCO and LVIS, PicoSAM3 achieves 65.45% and 64.01% mIoU, respectively, outperforming existing SAM-based and edge-oriented baselines at similar or lower complexity. The INT8 quantized model preserves accuracy with negligible degradation while enabling real-time in-sensor inference at 11.82 ms latency on the IMX500, fully complying with its memory and operator constraints. Ablation studies show that distillation from large SAM models yields up to +14.5% mIoU improvement over supervised training and demonstrate that high-quality, spatially flexible promptable segmentation is feasible directly at the sensor level.
Published: 2026-03-12 13:25:21
Authors: Rajdeep Pathak, Rahul Goswami, Madhurima Panja, Palash Ghosh, Tanujit Chakraborty
Categories: cs.LG, cs.AI, stat.ML
Abstract:
Reliable uncertainty quantification is critical in multivariate time series forecasting problems arising in domains such as energy systems and transportation networks, among many others. Although Transformer-based architectures have recently achieved strong performance for sequence modeling, most probabilistic forecasting approaches rely on restrictive parametric likelihoods or quantile-based objectives. They can struggle to capture complex joint predictive distributions across multiple correlated time series. This work proposes EnTransformer, a deep generative forecasting framework that integrates engression, a stochastic learning paradigm for modeling conditional distributions, with the expressive sequence modeling capabilities of Transformers. The proposed approach injects stochastic noise into the model representation and optimizes an energy-based scoring objective to directly learn the conditional predictive distribution without imposing parametric assumptions. This design enables EnTransformer to generate coherent multivariate forecast trajectories while preserving Transformers' capacity to effectively model long-range temporal dependencies and cross-series interactions. We evaluate our proposed EnTransformer on several widely used benchmarks for multivariate probabilistic forecasting, including Electricity, Traffic, Solar, Taxi, KDD-cup, and Wikipedia datasets. Experimental results demonstrate that EnTransformer produces well-calibrated probabilistic forecasts and consistently outperforms the benchmark models.
Published: 2026-03-12 13:13:50
Authors: Lu Wang, Zhuoran Jin, Yupu Hao, Yubo Chen, Kang Liu, Yulong Ao, Jun Zhao
Categories: cs.CV, cs.AI, cs.CL
Abstract:
Multimodal large language models (MLLMs) have shown strong performance on offline video understanding, but most are limited to offline inference or have weak online reasoning, making multi-turn interaction over continuously arriving video streams difficult. Existing streaming methods typically use an interleaved perception-generation paradigm, which prevents concurrent perception and generation and leads to early memory decay as streams grow, hurting long-range dependency modeling. We propose Think While Watching, a memory-anchored streaming video reasoning framework that preserves continuous segment-level memory during multi-turn interaction. We build a three-stage, multi-round chain-of-thought dataset and adopt a stage-matched training strategy, while enforcing strict causality through a segment-level streaming causal mask and streaming positional encoding. During inference, we introduce an efficient pipeline that overlaps watching and thinking and adaptively selects the best attention backend. Under both single-round and multi-round streaming input protocols, our method achieves strong results. Built on Qwen3-VL, it improves single-round accuracy by 2.6% on StreamingBench and by 3.79% on OVO-Bench. In the multi-round setting, it maintains performance while reducing output tokens by 56%. Code is available at: https://github.com/wl666hhh/Think_While_Watching/
Published: 2026-03-12 13:05:17
Authors: Flavio Tuteri, Sergio Chibbaro, Alexandros Alexakis
Categories: physics.flu-dyn
Abstract:
Classical shell models of turbulence do not display a dual cascade (inverse for energy, direct for enstrophy) because they fail to reproduce the right thermal spectra. We propose here a multi-branch shell model, including a geometry hierarchically organized across scales, in order to overcome this limitation. For this model, we demonstrate numerically both the agreement of the thermal spectra with those of two-dimensional fluid equations and the emergence of a statistically stationary dual cascade. This construction also allows us to study local transfers and to investigate both self-similarity and non-Gaussianity.
Published: 2026-03-12 13:01:41
Authors: Ahmed Magbool, Vaibhav Kumar, Marco Di Renzo, Mark F. Flanagan
Categories: eess.SP
Abstract:
Following recent advances in flexible electronics and programmable metasurfaces, flexible intelligent metasurfaces (FIMs) have emerged as a promising enabling technology for next-generation wireless networks. A FIM is a morphable electromagnetic surface capable of dynamically adjusting its physical geometry to influence the radiation and propagation of electromagnetic waves. Unlike conventional rigid arrays, FIMs introduce an additional spatial degree of design freedom enabled by mechanical flexibility, which can enhance beamforming, spatial focusing, and adaptation to dynamic wireless environments. This added capability enables wireless systems to shape the propagation environment not only through electromagnetic tuning but also through controllable geometric reconfiguration. This article explores the potential of FIMs for next-generation wireless networks. We first introduce the main hardware architectures of FIMs and explain how they can be integrated into wireless communication systems. We then present representative application scenarios, highlighting the advantages of FIMs for future wireless networks and comparing them with other emerging flexible wireless technologies. To illustrate their potential impact, we present case studies comparing FIM-enabled architectures with conventional rigid-array systems, demonstrating the performance gains enabled by surface flexibility for both communication and sensing applications. Finally, we discuss key opportunities, practical challenges, and open research directions that must be addressed to fully realize the potential of FIM technology in future wireless communication systems.
Published: 2026-03-12 12:57:54
Authors: Bowoo Kang
Categories: math.CV, math.AP, math.DG
Abstract:
We show the existence of a bounded solution to the Cauchy problem for the complex Monge-Ampère flow on a compact Kähler manifold, with the right-hand side of the form $dt \wedge dμ$ where $dμ$ is dominated by a Monge-Ampère measure of a Hölder continuous quasi-plurisubharmonic function. We also prove that for a given semi-positive big form $θ$, the $t$-slice of the solution is locally Hölder continuous on $\rm{Amp(θ)}$ for all $t \in (0, T)$. Next, we prove a comparison principle when $dμ$ is dominated by a Monge-Ampère measure of a bounded quasi-plurisubharmonic function, which implies the uniqueness of the solution.
Published: 2026-03-12 12:46:22
Authors: Omar Coser
Categories: q-bio.GN, cs.AI
Abstract:
Translating single-cell RNA sequencing (scRNA-seq) data into mechanistic biological hypotheses remains a critical bottleneck, as agentic AI systems lack direct access to transcriptomic representations while expression foundation models remain opaque to natural language. Here we introduce ELISA (Embedding-Linked Interactive Single-cell Agent), an interpretable framework that unifies scGPT expression embeddings with BioBERT-based semantic retrieval and LLM-mediated interpretation for interactive single-cell discovery. An automatic query classifier routes inputs to gene marker scoring, semantic matching, or reciprocal rank fusion pipelines depending on whether the query is a gene signature, natural language concept, or mixture of both. Integrated analytical modules perform pathway activity scoring across 60+ gene sets, ligand--receptor interaction prediction using 280+ curated pairs, condition-aware comparative analysis, and cell-type proportion estimation, all operating directly on embedded data without access to the original count matrix. Benchmarked across six diverse scRNA-seq datasets spanning inflammatory lung disease, pediatric and adult cancers, organoid models, healthy tissue, and neurodevelopment, ELISA significantly outperforms CellWhisperer in cell type retrieval (combined permutation test, $p < 0.001$), with particularly large gains on gene-signature queries (Cohen's $d = 5.98$ for MRR). ELISA replicates published biological findings (mean composite score 0.90) with near-perfect pathway alignment and theme coverage (0.98 each), and generates candidate hypotheses through grounded LLM reasoning, bridging the gap between transcriptomic data exploration and biological discovery. Code available at: https://github.com/omaruno/ELISA-An-AI-Agent-for-Expression-Grounded-Discovery-in-Single-Cell-Genomics.git (If you use ELISA in your research, please cite this work).
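For readers unfamiliar with reciprocal rank fusion, which the abstract above names as the pipeline for mixed queries, a minimal sketch of the standard technique follows. It is not taken from the ELISA code base: the function name, the conventional smoothing constant k = 60, and the example cell-type rankings are assumptions chosen for illustration, and ELISA's own variant may differ.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one using standard RRF.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k = 60 is the commonly used default and
    is an assumption here, not a value from the abstract.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical outputs of a gene-marker ranker and a semantic ranker.
gene_ranking = ["T cell", "B cell", "NK cell"]
semantic_ranking = ["B cell", "NK cell", "T cell"]
print(reciprocal_rank_fusion([gene_ranking, semantic_ranking]))
```

Because only ranks enter the score, the two rankers do not need comparable score scales, which is the usual reason for choosing this fusion rule.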
Published: 2026-03-12 12:36:56
Authors: Zi-Han Wang, Lam Nguyen, Zhengyang Zhao, Mengyue Yang, Chengwei Qin, Yujiu Yang, Linyi Yang
Categories: cs.AI
Abstract:
The saturation of high-quality pre-training data has shifted research focus toward evolutionary systems capable of continuously generating novel artifacts, leading to the success of AlphaEvolve. However, the progress of such systems is hindered by the lack of rigorous, quantitative evaluation. To tackle this challenge, we introduce CreativeBench, a benchmark for evaluating machine creativity in code generation, grounded in a classical cognitive framework. Comprising two subsets -- CreativeBench-Combo and CreativeBench-Explore -- the benchmark targets combinatorial and exploratory creativity through an automated pipeline utilizing reverse engineering and self-play. By leveraging executable code, CreativeBench objectively distinguishes creativity from hallucination via a unified metric defined as the product of quality and novelty. Our analysis of state-of-the-art models reveals distinct behaviors: (1) scaling significantly improves combinatorial creativity but yields diminishing returns for exploration; (2) larger models exhibit "convergence-by-scaling," becoming more correct but less divergent; and (3) reasoning capabilities primarily benefit constrained exploration rather than combination. Finally, we propose EvoRePE, a plug-and-play inference-time steering strategy that internalizes evolutionary search patterns to consistently enhance machine creativity.
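The product form of the metric described above has a simple consequence that a short sketch makes concrete: an output that is novel but fails execution scores zero. The sketch assumes both factors are normalized to [0, 1]; how CreativeBench actually measures quality and novelty is not stated in the abstract, so the function name and input values are hypothetical.

```python
def creativity_score(quality, novelty):
    """Product-of-factors score, assuming both inputs lie in [0, 1]."""
    if not (0.0 <= quality <= 1.0 and 0.0 <= novelty <= 1.0):
        raise ValueError("quality and novelty must be in [0, 1]")
    return quality * novelty


# A novel but non-working program (a "hallucination") scores zero,
# while a correct but unoriginal one scores low rather than high.
print(creativity_score(quality=0.0, novelty=0.9))
print(creativity_score(quality=1.0, novelty=0.1))
print(creativity_score(quality=0.8, novelty=0.5))
```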
Published: 2026-03-12 12:35:46
Authors: Ching-Yu Kao, Xinfeng Li, Shenyu Dai, Tianze Qiu, Pengcheng Zhou, Eric Hanchen Jiang, Philip Sperl
Categories: cs.CR, cs.AI
Abstract:
High-privilege LLM agents that autonomously process external documentation are increasingly trusted to automate tasks by reading and executing project instructions, yet they are granted terminal access, filesystem control, and outbound network connectivity with minimal security oversight. We identify and systematically measure a fundamental vulnerability in this trust model, which we term the Trusted Executor Dilemma: agents execute documentation-embedded instructions, including adversarial ones, at high rates because they cannot distinguish malicious directives from legitimate setup guidance. This vulnerability is a structural consequence of the instruction-following design paradigm, not an implementation bug. To structure our measurement, we formalize a three-dimensional taxonomy covering linguistic disguise, structural obfuscation, and semantic abstraction, and construct ReadSecBench, a benchmark of 500 real-world README files enabling reproducible evaluation. Experiments on the commercially deployed computer-use agent show end-to-end exfiltration success rates up to 85%, consistent across five programming languages and three injection positions. Cross-model evaluation on four LLM families in a simulation environment confirms that semantic compliance with injected instructions is consistent across model families. A 15-participant user study yields a 0% detection rate across all participants, and evaluation of 12 rule-based and 6 LLM-based defenses shows neither category achieves reliable detection without unacceptable false-positive rates. Together, these results quantify a persistent Semantic-Safety Gap between agents' functional compliance and their security awareness, establishing that documentation-embedded instruction injection is a persistent and currently unmitigated threat to high-privilege LLM agent deployments.
Published: 2026-03-12 12:28:06
Authors: Keita Kayano, Takayuki Nishio, Daiki Yoda, Yuta Hirai, Tomoko Adachi
Categories: cs.LG
Abstract:
We propose a WiFi Channel State Information (CSI) sensing framework for multi-station deployments that addresses two fundamental challenges in practical CSI sensing: station-wise feature missingness and limited labeled data. Feature missingness is commonly handled by resampling unevenly spaced CSI measurements or by reconstructing missing samples, while label scarcity is mitigated by data augmentation or self-supervised representation learning. However, these techniques are typically developed in isolation and do not jointly address long-term, structured station unavailability together with label scarcity. To bridge this gap, we explicitly incorporate station unavailability into both representation learning and downstream model training. Specifically, we adapt cross-modal self-supervised learning (CroSSL), a representation learning framework originally designed for time-series sensory data, to multi-station CSI sensing in order to learn representations that are inherently invariant to station-wise feature missingness from unlabeled data. Furthermore, we introduce Station-wise Masking Augmentation (SMA) during downstream model training, which exposes the model to realistic station unavailability patterns under limited labeled data. Our experiments show that neither missingness-invariant pre-training nor station-wise augmentation alone is sufficient; their combination is essential to achieve robust performance under both station-wise feature missingness and label scarcity. The proposed framework provides a practical and robust foundation for multi-station WiFi CSI sensing in real-world deployments.
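A rough sketch of what station-wise masking during training could look like is given below. It is an illustration under stated assumptions, not the authors' SMA procedure: the abstract does not specify the masking distribution or the fill value, so the independent per-station drop probability, the zero fill, the rule that at least one station is always kept, and the function name are all hypothetical.

```python
import random


def station_wise_mask(sample, drop_prob, rng):
    """Zero out whole stations of one CSI sample.

    sample    : list of per-station feature lists (one list per station)
    drop_prob : hypothetical probability of dropping each station
    rng       : random.Random instance, for reproducibility

    At least one station is always kept so the sample stays informative.
    """
    n_stations = len(sample)
    keep = [rng.random() >= drop_prob for _ in range(n_stations)]
    if not any(keep):
        keep[rng.randrange(n_stations)] = True
    return [
        list(features) if kept else [0.0] * len(features)
        for features, kept in zip(sample, keep)
    ]


# Three stations with two features each; values are made up.
sample = [[0.5, 1.2], [0.7, 0.9], [1.1, 0.3]]
augmented = station_wise_mask(sample, drop_prob=0.5, rng=random.Random(0))
print(augmented)
```

Masking entire stations, rather than individual samples or subcarriers, is what distinguishes this from generic dropout-style augmentation: it mimics a station being unavailable for the whole window.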
Published: 2026-03-12 12:11:56
Authors: Muhammad Asad Ullah, Davi Brilhante, Luís Eduardo Partichelli Potrich, José Suárez-Varela, Paul Almasan, Charles Cleary, Vadim Kramar
Categories: cs.NI
Abstract:
Communication, Navigation, and Surveillance (CNS) is the backbone of the Air Traffic Management (ATM) and Unmanned Aircraft System (UAS) Traffic Management (UTM) systems, ensuring safe and efficient operations of modern and future aviation. Traditionally, the CNS is considered three independent systems: communications, navigation, and surveillance. The current CNS system is fragmented, with limited integration across its three domains. Integrated CNS (ICNS) is a contemporary concept implying that those systems are provisioned through the same technology stack. ICNS is envisioned to improve service quality, spectrum efficiency, communication capacity, navigation predictability, and surveillance capabilities. The 5G technology stack offers higher throughput, lower latency, and massive connectivity compared to many existing communication technologies. This paper presents our 5G ICNS vision and network architecture and discusses how 5G technology can support integrated CNS services using terrestrial and non-terrestrial networks. We also discuss key 5G radio access technologies for delivering integrated CNS services at low altitudes for Innovative Air Mobility (IAM) and Advanced Air Mobility (AAM) operations. Finally, we present relevant challenges and potential research directions for further studies.
Published: 2026-03-12 12:07:49
Authors: Yuming Bai, Rulin Tian, Yue Zhang, Tao Wang
Categories: cond-mat.mtrl-sci, physics.app-ph
Abstract:
Spin-orbit torque (SOT) enables efficient current-driven control of magnetization, offering a promising pathway toward low-power spintronic devices. However, the origin and propagation of both damping-like (DL) and field-like (FL) SOTs in complex multilayers remain unclear. Here, we investigate NiFe thickness-dependent SOT efficiencies in Ta/Pt/Co/Cu/NiFe/Cu/Capping multilayers (x = 15 nm; Capping = Pt, Al, and SiO2). By employing a spin rotation geometry, the perpendicularly magnetized Pt/Co/Cu stacks serve as a spin source introducing unconventional spin polarization orthogonal to the Oersted field, eliminating its contribution and enabling unambiguous separation of SOTs using planar Hall and polar MOKE measurements. To distinguish bulk and interfacial contributions, we introduce a sample-area-normalized moment m = mNiFe/S, accounting for thickness-dependent magnetization and eliminating uncertainties arising from nominal thickness scaling and magnetic dead layers. We find that DL-SOT follows nearly linear 1/m scaling, consistent with rapid spin absorption at the Cu/NiFe interface but exhibits finite beta_SOT when 1/m approaches zero in both Pt- and Al-capped samples, indicating additional interfacial spin-current contributions at Cu/Pt and Cu/Al interfaces. In contrast, SiO2-capped samples show negligible interfacial contributions. Furthermore, FL-SOT deviates markedly from 1/m scaling and exhibits a significantly longer spin dephasing length (about 1.7 nm) compared to DL-SOT, implying extended propagation across NiFe. Comparative capping-layer studies further corroborate this behavior through interface-dependent spin transport. Our findings clarify the origin and distinct propagation characteristics of DL and FL torques, providing guidelines for engineering interfacial spin-orbit functionalities in ultrathin metallic heterostructures.
Published: 2026-03-12 12:04:52
Authors: Hengzhi Li, Wanyue Xiao, Junho Jung, Hao Pan, Shubo Wang
Categories: physics.optics
Abstract:
Moving media break time-reversal symmetry and exhibit intriguing optical nonreciprocity. This nonreciprocity is usually weak due to the much lower moving speed of media relative to the speed of light. We demonstrate that strong optical nonreciprocity can emerge in a two-dimensional photonic crystal composed of spinning dielectric cylinders. The photonic crystal supports two types of chiral modes at the Brillouin zone center: hybridized multipole modes and symmetry-protected bound states in the continuum (BICs), both of which carry intrinsic spin angular momentum. For finite wavevectors near the zone center, the BICs transform into quasi-bound states in the continuum (QBICs). Under oblique incidence of circularly polarized plane waves, the photonic crystal exhibits nonreciprocal transmission and absorption that are significantly enhanced at the frequencies of these hybridized multipole modes and QBICs. Furthermore, the high quality factors of the QBICs enable sharp transitions in nonreciprocity. Our work uncovers strong chiral light-matter interactions in periodic moving structures, with potential applications in nonreciprocal light manipulation. The mechanism may also be generalized to other classical wave systems, such as phononic crystals.
Published: 2026-03-12 12:02:36
Authors: Ali Sadeghkhani, Brandon Bennett, Masoud Babaei, Arash Rabbani
Categories: cs.CV, cond-mat.mtrl-sci, physics.geo-ph
Abstract:
Digital reconstruction of porous materials has become increasingly critical for applications ranging from geological reservoir characterization to tissue engineering and electrochemical device design. While traditional methods such as micro-computed tomography and statistical reconstruction approaches have established foundations in this field, the emergence of deep learning techniques, particularly Generative Adversarial Networks (GANs), has revolutionized porous media reconstruction capabilities. This review systematically analyzes 96 peer-reviewed articles published from 2017 to early 2026, examining the evolution and applications of GAN-based approaches for porous material image reconstruction. We categorize GAN architectures into six distinct classes, namely Vanilla GANs, Multi-Scale GANs, Conditional GANs, Attention-Enhanced GANs, Style-based GANs, and Hybrid Architecture GANs. Our analysis reveals substantial progress including improvements in porosity accuracy (within 1% of original samples), permeability prediction (up to 79% reduction in mean relative errors), and achievable reconstruction volumes (from initial $64^3$ to current $2{,}200^3$ voxels). Despite these advances, persistent challenges remain in computational efficiency, memory constraints for large-scale reconstruction, and maintaining structural continuity in 2D-to-3D transformations. This systematic analysis provides a comprehensive framework for selecting appropriate GAN architectures based on specific application requirements.
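The porosity accuracy quoted in this abstract compares the pore fraction of a reconstructed volume with that of the original sample. The following is a minimal sketch of such a check, assuming a binary voxel volume stored as nested lists with 0 for pore and 1 for solid. That labeling convention, the function names, and the toy volumes are assumptions for illustration and are not taken from the reviewed articles.

```python
def porosity(volume):
    """Fraction of pore voxels in a 3D binary volume (0 = pore, 1 = solid)."""
    total = 0
    pores = 0
    for plane in volume:
        for row in plane:
            for voxel in row:
                total += 1
                if voxel == 0:
                    pores += 1
    return pores / total


def porosity_error(original, reconstructed):
    """Absolute difference in porosity between two volumes."""
    return abs(porosity(original) - porosity(reconstructed))


# Toy 2x2x2 volumes, purely illustrative.
original = [[[0, 1], [1, 0]], [[0, 1], [1, 0]]]
reconstructed = [[[0, 1], [1, 0]], [[0, 1], [1, 1]]]

print(porosity(original))                        # 4 of 8 voxels are pores
print(porosity_error(original, reconstructed))   # 4/8 versus 3/8
```

Real reconstructions are far larger (the abstract cites volumes up to $2{,}200^3$ voxels), where an array library would replace the nested loops.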
Published: 2026-03-12 11:57:12
Authors: Justin R. Crepp, Caleb G. Abbott, James Smous, Matthew Engstrom, Brian Sands
Categories: physics.optics, astro-ph.IM
Abstract:
Path-length diversity methods may be used for adaptive optics (AO) systems to retrieve phase and amplitude information by measuring intensity across multiple planes. Observations that rely on free-space propagation, such as the nonlinear curvature wavefront sensor (WFS), have been shown to offer excellent sensitivity and robustness to scintillation. However, the default design results in a large opto-mechanical footprint due to unavoidable geometric-optics and wave-optics effects. Measurements recorded in a convergent beam would improve instrument compactness, while concentrating light into smaller detector regions of interest, improving signal-to-noise ratio and possibly wavefront reconstruction speed. In this paper, we study path-length diversity wavefront sensing using four planes of contemporaneous intensity measurements made in a convergent beam. We develop a physical optics propagation model and validate the model by performing wavefront reconstructions in both simulations and lab experiments. The manuscript's core contribution is a practical, intensity-domain, Fourier-transform-based recipe to use a conventional multi-plane Gerchberg-Saxton (or comparable) reconstruction pipeline with convergent-beam measurements, enabling a compact optical layout. We find that this approach offers practical benefits over an equivalent free-space wavefront sensor, in particular reducing size, weight, complexity and cost.
Published: 2026-03-12 11:54:29
Authors: Jiahao Li, Qingwang Zhang, Qiuyu Chen, Guozhan Qiu, Yunzhong Lou, Xiangdong Zhou
Categories: cs.CV
Abstract:
The field of Computer-Aided Design (CAD) generation has made significant progress in recent years. Existing methods typically fall into two separate categories: parametric CAD modeling and direct boundary representation (B-Rep) synthesis. In modern feature-based CAD systems, parametric modeling and B-Rep are inherently intertwined, as advanced parametric operations (e.g., fillet and chamfer) require explicit selection of B-Rep geometric primitives, and the B-Rep itself is derived from parametric operations. Consequently, this paradigm gap remains a critical factor limiting AI-driven CAD modeling for complex industrial product design. This paper presents FutureCAD, a novel text-to-CAD framework that leverages large language models (LLMs) and a B-Rep grounding transformer (BRepGround) for high-fidelity CAD generation. Our method generates executable CadQuery scripts and introduces a text-based query mechanism that enables the LLM to specify geometric selections via natural language, which BRepGround then grounds to the target primitives. To train our framework, we construct a new dataset comprising real-world CAD models. For the LLM, we apply supervised fine-tuning (SFT) to establish fundamental CAD generation capabilities, followed by reinforcement learning (RL) to improve generalization. Experiments show that FutureCAD achieves state-of-the-art CAD generation performance.