publications
Weiqin Yang's publications, generated by jekyll-scholar.
2025
- NeurIPS 2024
  PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation
  In Advances in Neural Information Processing Systems, 2025
Softmax Loss (SL) is widely applied in recommender systems (RS) and has demonstrated effectiveness. This work analyzes SL from a pairwise perspective, revealing two significant limitations: 1) the relationship between SL and conventional ranking metrics like DCG is not sufficiently tight; 2) SL is highly sensitive to false negative instances. Our analysis indicates that these limitations are primarily due to the use of the exponential function. To address these issues, this work extends SL to a new family of loss functions, termed Pairwise Softmax Loss (PSL), which replaces the exponential function in SL with other appropriate activation functions. While the revision is minimal, we highlight three merits of PSL: 1) it serves as a tighter surrogate for DCG with suitable activation functions; 2) it better balances data contributions; and 3) it acts as a specific BPR loss enhanced by Distributionally Robust Optimization (DRO). We further validate the effectiveness and robustness of PSL through empirical experiments. The code is available at https://github.com/Tiny-Snow/IR-Benchmark.
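
The idea of swapping the exponential for another activation can be illustrated with a minimal sketch. It assumes a pairwise form in which the loss is the log of the summed activations of score differences, each raised to the power 1/τ; with `math.exp` this reduces to a sampled softmax loss. The function names, the `tanh_plus_one` activation, and the temperature handling are illustrative assumptions, not the implementation from the linked repository.

```python
import math


def pairwise_softmax_loss(pos_score, neg_scores, activation=math.exp, tau=1.0):
    """Assumed pairwise form: log(sum_j activation(d_j) ** (1 / tau)).

    d_j = neg_score_j - pos_score is the score gap of each negative item.
    With activation=math.exp this equals log(sum_j exp(d_j / tau)).
    No numerical stabilization is applied; this is a didactic sketch only.
    """
    total = 0.0
    for neg in neg_scores:
        d = neg - pos_score
        total += activation(d) ** (1.0 / tau)
    return math.log(total)


def tanh_plus_one(d):
    """Hypothetical bounded surrogate activation with values in (0, 2)."""
    return math.tanh(d) + 1.0


if __name__ == "__main__":
    pos, negs = 2.0, [1.0, 0.5, 3.0]
    print("exp activation: ", pairwise_softmax_loss(pos, negs, math.exp, tau=0.5))
    print("tanh activation:", pairwise_softmax_loss(pos, negs, tanh_plus_one, tau=0.5))
```

Because the bounded activation caps each negative's contribution, a single high-scoring (possibly false) negative cannot dominate the sum the way it does under the exponential.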
2024
- Micro. Biotec.
  Exploring the secrets of marine microorganisms: Unveiling secondary metabolites through metagenomics
  Shaoyu Wang, Xinyan Li, Weiqin Yang, and Ranran Huang*
  Microbial Biotechnology, 2024
Marine microorganisms are increasingly recognized as primary producers of marine secondary metabolites, drawing growing research interest. Many of these organisms are unculturable, posing challenges for study. Metagenomic techniques enable research on these unculturable microorganisms, identifying various biosynthetic gene clusters (BGCs) related to marine microbial secondary metabolites, thereby unveiling their secrets. This review comprehensively analyses metagenomic methods used in discovering marine microbial secondary metabolites, highlighting tools commonly employed in BGC identification, and discussing the potential and challenges in this field. It emphasizes the key role of metagenomics in unveiling secondary metabolites, particularly in marine sponges and tunicates. The review also explores current limitations in studying these metabolites through metagenomics, noting how long-read sequencing technologies and the evolution of computational biology tools offer more possibilities for BGC discovery. Furthermore, the development of synthetic biology allows experimental validation of computationally identified BGCs, showcasing the vast potential of metagenomics in mining marine microbial secondary metabolites.
2023
- Front. Microbiol.
  EVMP: Enhancing machine learning models for synthetic promoter strength prediction by Extended Vision Mutant Priority framework
  Weiqin Yang†, Dexin Li†, and Ranran Huang*
  Frontiers in Microbiology, 2023
**Introduction**: In metabolic engineering and synthetic biology applications, promoters with appropriate strengths are critical. However, annotating promoter strength experimentally is time-consuming and laborious. Constructing mutation-based synthetic promoter libraries that span multiple orders of magnitude of promoter strength is receiving increasing attention. A number of machine learning (ML) methods have been applied to synthetic promoter strength prediction, but existing models are limited by the excessive proximity between synthetic promoters. **Methods**: To help ML models better predict synthetic promoter strength, we propose EVMP (Extended Vision Mutant Priority), a universal framework that utilizes mutation information more effectively. In EVMP, synthetic promoters are equivalently transformed into a base promoter and the corresponding k-mer mutations, which are input into BaseEncoder and VarEncoder, respectively. EVMP also provides optional data augmentation, which generates multiple copies of the data by selecting different base promoters for the same synthetic promoter. **Results**: On the Trc synthetic promoter library, EVMP was applied to multiple ML models and improved model performance to varying extents, by up to 61.30% (MAE), while the SOTA (state-of-the-art) record was improved by 15.25% (MAE) and 4.03% (R²). Data augmentation based on multiple base promoters further improved model performance by 17.95% (MAE) and 7.25% (R²) compared with the non-EVMP SOTA record. **Discussion**: In further study, extended vision (or k-mer) is shown to be essential for EVMP. We also found that EVMP can alleviate the over-smoothing phenomenon, which may contribute to its effectiveness. Our work suggests that EVMP can highlight the mutation information of synthetic promoters and significantly improve the accuracy of strength prediction. The source code is publicly available on GitHub: https://github.com/Tiny-Snow/EVMP.
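
The transformation of a synthetic promoter into a base promoter plus k-mer mutations can be sketched roughly as follows. This is an assumption-laden illustration: it handles only substitutions between equal-length sequences, and it takes each "k-mer mutation" to be the window of the synthetic sequence centred on a mutated position. The function name `split_promoter` and the window convention are hypothetical and may differ from the encoding used in the EVMP repository.

```python
def split_promoter(base, synthetic, k=5):
    """Decompose a synthetic promoter into (base, k-mer mutations).

    Assumes substitution-only mutations, so both sequences have equal length.
    Each mutation is reported as (position, k-mer), where the k-mer is the
    window of the synthetic sequence centred on the mutated position and
    truncated at the sequence ends.
    """
    if len(base) != len(synthetic):
        raise ValueError("this sketch only handles equal-length sequences")
    half = k // 2
    mutations = []
    for i, (b, s) in enumerate(zip(base, synthetic)):
        if b != s:
            start = max(0, i - half)
            end = min(len(synthetic), i + half + 1)
            mutations.append((i, synthetic[start:end]))
    return base, mutations


if __name__ == "__main__":
    base_seq = "ACGTACGTAC"       # toy sequences, not real promoters
    synthetic_seq = "ACGTTCGTAC"  # single substitution at index 4
    print(split_promoter(base_seq, synthetic_seq, k=3))
```

In a pipeline of the kind described above, the first element would feed the base encoder and the list of k-mers the mutation encoder; choosing a different base promoter for the same synthetic sequence yields a different decomposition, which is the basis of the data augmentation mentioned in the abstract.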