List: Reinforcement Learning (2)
김태오
Once the live loop existed, I started tuning the control behavior instead of only tuning model metrics. This was where Optuna and Ray/RLlib entered the project. I did not want tuning to be invisible. If the reward weights changed, I wanted the dashboard to show the trials. If the RL policy was disabled for a fast run, I wanted the dashboard to say so. If a PPO bootstrap completed, I wanted the ch..
The six-layer orchestrator was the point where the project stopped being only a data/model pipeline and became an actual control-plane experiment. I built it because I was tired of looking at model scores in isolation. A risk score is useful only if something can consume it. A demand estimate is useful only if it can affect efficiency behavior. Queue pressure is useful only if admission control c..