GitHub / Email / LinkedIn / Wantedly / arXiv / note

AI research / machine learning engineering

Shogo Noguchi (野口翔伍)

M1 / Master’s 1st Year / The University of Tokyo, GSII / Kamijo Lab, UTokyo IIS

I am a first-year master's student (M1) in the Kamijo Laboratory at The University of Tokyo, Graduate School of Interdisciplinary Information Studies. I am interested in applied AI across domains, with a particular focus on training and adapting foundation models. From Feb 2025 to Mar 2026, I worked as a Research Assistant at Sony CSL on identifying music from brain-wave (EEG) signals recorded during music listening. For my bachelor thesis, I worked on scene generation for autonomous-driving data augmentation that changes weather and time of day while preserving road structure.

Seeking new-graduate roles in AI research or machine learning engineering.

Documents

Downloads

The Resume is a concise version; the CV contains full research details.

Resume EN (updated 2026/05/14) / CV EN (updated 2026/05/14) / Resume JA (updated 2026/05/14) / CV JA (updated 2026/05/14)

Foundation-model learning, representation learning, multimodal AI, generative AI, brain signals / signal processing, autonomous-driving AI

Current: Kamijo Laboratory, The University of Tokyo
GSII / UTokyo IIS
M1 / Master’s 1st Year
Lab website

Languages
Nationality: Japanese
Native language: Japanese
English: business & research communication

Previous: Sony CSL Research Assistant
Feb 2025 – Mar 2026

Education: Gunma University, B.Eng.
GPA 4.16 / 4.30

News

Latest updates

Selected AI project outcomes.

[Figure: Conceptual diagram showing cortical representation, EEG recording, and EEG recognition modeling.] Story 1. Cortical activity encodes acoustic and expectation-related information. In this work, ANN representations are used as teacher signals for EEG modeling.

[Figure: Diagram showing acoustic and expectation-related ANN representations computed from auditory stimuli.] Story 2. Acoustic-ANN and Expectation-ANN representations are computed from raw auditory stimuli and treated as separate learning targets.

[Figure: Neural network architecture of the EEG encoder-decoder model with masked prediction and song classification.] Story 3. The EEG encoder-decoder is pretrained to predict music-ANN teacher representations and then evaluated on downstream song identification.

Sony CSL / Tokyo, Japan / Research Assistant

Sony Computer Science Laboratories, Inc.

Research Assistant under Natalia Polouliakh (former Associate Researcher) and Taketo Akama (former Project Researcher).

Links: Natalia Polouliakh / Taketo Akama / Related Sony CSL work: Deep12

Multimodal foundation-model research for EEG and music representation learning

This project studies how EEG recognition can be improved by pretraining with teacher representations derived from music foundation models. The key idea is to separate acoustic representations from expectation-related representations and use both to guide EEG modeling.
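The masked-prediction pretraining idea can be illustrated with a minimal sketch. It assumes a mean-squared-error objective computed only at masked time steps; the page does not state the actual loss or masking scheme, and `masked_teacher_loss` and the toy data are hypothetical names and values used only for illustration.

```python
def masked_teacher_loss(pred, teacher, mask):
    """Mean squared error over masked time steps only (an assumed objective).

    pred, teacher: lists of per-time-step feature vectors with the same shape.
    mask: list of booleans, True where the EEG input was hidden from the model.
    """
    total, count = 0.0, 0
    for p_row, t_row, is_masked in zip(pred, teacher, mask):
        if not is_masked:
            continue  # visible steps do not contribute to the loss
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("mask selects no time steps")
    return total / count


# Hypothetical teacher features for four time steps (two dimensions each).
teacher = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
mask = [False, True, False, True]

# Predictions are off by 0.5 on masked steps and by 9.0 on visible steps.
pred = [
    [t + (0.5 if m else 9.0) for t in row]
    for row, m in zip(teacher, mask)
]

loss = masked_teacher_loss(pred, teacher, mask)
print(loss)  # 0.25: only the 0.5 errors on masked steps are counted
```

In this toy setup the large errors on visible steps are ignored, which is the property that makes the objective a prediction task rather than a reconstruction of what the model already sees.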

Role

First author. Designed the learning-target features, ran training and evaluation in PyTorch, wrote the paper, and released the project page and code.

Evidence

0.859 best single model / 0.887 three-model ensemble.

Tech

PyTorch, Transformer encoder for EEG signals, masked prediction, MuQ, MusicGen, NMED-T.

Links: Paper (arXiv) / Project site / Code (GitHub) / Model weights (Hugging Face)
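The Evidence line for this project reports a best single-model score and a three-model ensemble score. How the three models are combined is not described on this page; the sketch below assumes the common approach of averaging per-class probabilities, with `ensemble_predict` and the probabilities being hypothetical.

```python
def ensemble_predict(prob_sets):
    """Average per-class probabilities across models and pick the top class.

    prob_sets: one probability list per model, all for the same input.
    Returns (predicted class index, averaged probabilities).
    """
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    mean = [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Hypothetical outputs of three models for one EEG segment and three songs.
model_probs = [
    [0.50, 0.40, 0.10],  # model 1 narrowly prefers song 0
    [0.20, 0.70, 0.10],  # model 2 confidently prefers song 1
    [0.45, 0.35, 0.20],  # model 3 narrowly prefers song 0
]

best, mean = ensemble_predict(model_probs)
print(best)  # 1
```

Here two of the three models individually pick song 0, but the averaged probabilities favour song 1, which shows how probability averaging can differ from a simple majority vote.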

[Figure: Architecture diagram for conflict suppression in multi-condition diffusion models.] Story 1. The core module suppresses conflicts among semantic segmentation, depth, and edge conditions by selecting effective conditions at each location.

[Figure: Generation pipeline combining weather and time control with structural conditions.] Story 2. The generation pipeline combines weather/time control with structural conditions to generate driving scenes while preserving road geometry.

[Figure: Evaluation framework projecting generated and original images into a shared evaluation space.] Story 3. The evaluation framework projects generated and original images into a shared evaluation space for multi-aspect comparison.

Gunma University / Bachelor thesis project

Gunma University / Yuminaka Laboratory

Bachelor thesis project in the Electronics and Informatics Program, conducted in the Yuminaka Laboratory.

Yuminaka Laboratory page

Conflict-suppressed multi-condition diffusion for autonomous-driving data augmentation

This project generates rare driving-scene variations, such as night-time and rainy scenes, while preserving road layout, objects, depth, and edges. The core idea is to use multiple structural conditions and suppress conflicts among them so that the generated data remains useful for high-level driving tasks.
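One way to picture "selecting effective conditions at each location" is a per-location softmax over condition scores followed by a weighted sum of condition features. The sketch below is a toy illustration under that assumption and is not the module from the paper; `fuse_conditions`, the scalar features, and the scores are all hypothetical.

```python
import math


def fuse_conditions(features, scores):
    """Fuse condition maps with per-location softmax weights.

    features: dict mapping condition name to a 2D grid of scalar features.
    scores: dict mapping condition name to a 2D grid of relevance scores.
    Returns (fused grid, grid of per-condition weights).
    """
    names = list(features)
    height = len(features[names[0]])
    width = len(features[names[0]][0])
    fused = [[0.0] * width for _ in range(height)]
    weights = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            raw = [scores[n][y][x] for n in names]
            top = max(raw)  # subtract the max for numerical stability
            exps = [math.exp(v - top) for v in raw]
            norm = sum(exps)
            w = [e / norm for e in exps]
            fused[y][x] = sum(wi * features[n][y][x] for wi, n in zip(w, names))
            weights[y][x] = dict(zip(names, w))
    return fused, weights


# Hypothetical 1x2 maps: segmentation dominates at the first location,
# and all three conditions are equally relevant at the second.
features = {"seg": [[1.0, 1.0]], "depth": [[2.0, 2.0]], "edge": [[3.0, 3.0]]}
scores = {"seg": [[10.0, 0.0]], "depth": [[0.0, 0.0]], "edge": [[0.0, 0.0]]}

fused, weights = fuse_conditions(features, scores)
print(weights[0][0]["seg"], fused[0][1])
```

A high score for one condition at a location pushes the other conditions' weights toward zero there, so a conflicting cue contributes little to the fused feature at that position.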

Role

Bachelor thesis. Designed the model, built the evaluation pipeline, and released the model weights.

Evidence

Depth RMSE 33.02 → 27.77 / Object F1 0.0889 → 0.1071.

Tech

Diffusion models, ControlNet, Stable Diffusion, Metric3D, Qwen3-VL, Waymo.

Links: Paper (arXiv) / Project site / Code (GitHub) / Model weights (Hugging Face)
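The Evidence line for this project reports depth RMSE and object F1. For reference, the sketch below gives the standard definitions of both metrics and the relative change implied by the reported numbers (roughly a 15.9% lower RMSE and a 20.5% higher F1). The exact evaluation protocol is not described on this page, so the helper names and toy inputs are assumptions.

```python
import math


def rmse(pred, target):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_change(before, after):
    """Signed change relative to the starting value."""
    return (after - before) / before


# Toy inputs to exercise the definitions.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(f1_score(tp=2, fp=1, fn=1))

# Relative change for the figures reported on this page.
depth_change = relative_change(33.02, 27.77)  # negative: RMSE decreased
f1_change = relative_change(0.0889, 0.1071)   # positive: F1 increased
print(round(depth_change, 3), round(f1_change, 3))
```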

Papers

Papers and manuscripts.

arXiv preprint / first author

Music Identification from Brain-Wave Signals via Acoustic and Expectation-Related ANN Representations

Shogo Noguchi, Taketo Akama, Tai Nakamura, Shun Minamikawa, Natalia Polouliakh

Abstract

During music listening, cortical activity encodes both acoustic and expectation-related information. Prior work has shown that ANN representations resemble cortical representations and can serve as supervisory signals for EEG recognition. Here we show that distinguishing acoustic and expectation-related ANN representations as teacher targets improves EEG-based music identification. Models pretrained to predict either representation outperform non-pretrained baselines, and combining them yields complementary gains that exceed strong seed ensembles formed by varying random initializations. These findings show that teacher representation type shapes downstream performance and that representation learning can be guided by neural encoding. This work points toward advances in predictive music cognition and neural decoding. Our expectation representation, computed directly from raw signals without manual labels, reflects predictive structure beyond onset or pitch, enabling investigation of multilayer predictive encoding across diverse stimuli. Its scalability to large, diverse datasets further suggests potential for developing general-purpose EEG models grounded in cortical encoding principles.

Links: Paper (arXiv) / Project site / Code (GitHub)

arXiv preprint / first author

AtteConDA: Attention-Based Conflict Suppression in Multi-Condition Diffusion Models and Synthetic Data Augmentation

Shogo Noguchi

Abstract

Recent conditional image generation methods can improve controllability by generating images that are faithful to conditions such as sketches, human poses, segmentation maps, and depth. By applying these techniques to image augmentation while preserving annotations, generated images can be used as additional training data and can improve recognition performance. However, for high-level driving tasks such as traffic-rule extraction and driving-behavior understanding, simply using annotations as conditions is insufficient. Instead, images must be augmented while preserving the detailed high-level structure of the original scene. One possible solution is to use multiple conditions so that generated images retain diverse structural cues after generation. However, when multiple conditions are used, conflicts among conditions can prevent reliable structure preservation. In this work, we input semantic segmentation, depth, and edges extracted from the original image into a multi-condition image generation model, thereby providing rich structural information as conditions. We further propose a modeling approach for handling conflicts among multiple conditions and show that it enables image generation with stronger structural preservation. We also build a generation framework and evaluation protocol for driving tasks, establishing a basis for comparison with prior and future models. As a result, this work contributes to image generation research by addressing condition conflicts in multi-condition generation and provides an important step toward mitigating data scarcity in high-level autonomous-driving tasks.

Links: Paper (arXiv) / Project site / Code (GitHub)

Achievements

Awards, grades, and credentials.

Award: JSME Hatakeyama Award

The Japan Society of Mechanical Engineers, Mar 2026.

Representative: Graduation representative

Electronics and Informatics Program, Gunma University, Mar 2026.

GPA: 4.16 / 4.30

Final undergraduate GPA at Gunma University.

Tuition: Full tuition exemption

President-certified full tuition exemption for the second semester, Gunma University, Oct 2025.

English: TOEIC L&R 895

Obtained Apr 2024.

English: TOEFL iBT 74

Obtained Mar 2024.

Experience / education

Work and education timeline.

2026.04 – Present / Education

The University of Tokyo

Graduate School of Interdisciplinary Information Studies, Emerging Design and Informatics Course; Kamijo Laboratory. Early-stage direction for the master’s research: updating HD-map information from in-vehicle camera images and estimating the geometric correspondence between traffic signs and lanes.

2025.02 – 2026.03 / Work

Sony Computer Science Laboratories, Inc.

Research Assistant, Mind Music Project / Research Activation Group. Led a first-author project on identifying music from brain-wave (EEG) signals recorded during music listening, and took part in technical discussions in English and Japanese with multinational team members.

2022.04 – 2026.03 / Education

Gunma University

B.Eng., School of Science and Technology, Electronics and Informatics Program; GPA 4.16/4.30; graduation representative; bachelor thesis on structure-preserving scene generation for autonomous driving.

2019.04 – 2022.03 / Education

Kanagawa Prefectural Chigasaki Hokuryo High School

High school education before entering Gunma University.
