| Dataset | Method | Tmux | |
|---|---|---|---|
Main Pre-Search commands:
--calibration_tolerance 0.05
--calibration_upper_bound 0.4
--num_few_shot_samples 45
# aflow
python run.py --dataset HotpotQA --aflow
# calibrated_prediction
python run.py --dataset HotpotQA --eval_metric_key calibrated_prediction --few_shot_data_calibration --calibration_tolerance 0.05 --calibration_upper_bound 0.4 --num_few_shot_samples 45
# few_shot_score
python run.py --dataset HotpotQA --eval_metric_key few_shot_score --few_shot_data_calibration --calibration_tolerance 0.05 --calibration_upper_bound 0.4 --num_few_shot_samples 45
# prediction
python run.py --dataset HotpotQA --eval_metric_key prediction
# self_confidence
python run.py --dataset HotpotQA --eval_metric_key self_confidence
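
The five runs above differ only in the metric key and in whether the calibration flags are attached, so they can be generated from one place instead of copied by hand. Below is a minimal sketch of that idea. The helper `build_command` and the constants `CALIBRATION_FLAGS` and `CALIBRATED_KEYS` are hypothetical names introduced here, not part of `run.py`. The flag names and values are copied from the commands above. The sketch also assumes that only `calibrated_prediction` and `few_shot_score` take the calibration flags, as in those commands.

```python
import shlex

# Flags shared by the calibrated runs (values copied from the commands above).
CALIBRATION_FLAGS = [
    "--few_shot_data_calibration",
    "--calibration_tolerance", "0.05",
    "--calibration_upper_bound", "0.4",
    "--num_few_shot_samples", "45",
]

# Metric keys that carry the calibration flags in the commands above (assumption:
# the other keys are run without them).
CALIBRATED_KEYS = {"calibrated_prediction", "few_shot_score"}


def build_command(dataset, metric_key=None):
    """Return the argv list for one run; metric_key=None means the --aflow baseline."""
    cmd = ["python", "run.py", "--dataset", dataset]
    if metric_key is None:
        return cmd + ["--aflow"]
    cmd += ["--eval_metric_key", metric_key]
    if metric_key in CALIBRATED_KEYS:
        cmd += CALIBRATION_FLAGS
    return cmd


if __name__ == "__main__":
    keys = [None, "calibrated_prediction", "few_shot_score",
            "prediction", "self_confidence"]
    for key in keys:
        # shlex.join quotes arguments safely for pasting into a shell.
        print(shlex.join(build_command("HotpotQA", key)))
```

Each printed line can be pasted into its own tmux window, or the argv list can be passed to `subprocess.run` directly.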
AgentPrune commands:
python experiments/run_gsm8k.py --agent_nums 5 --mode FullConnected --batch_size 50 --num_iterations 2 --imp_per_iterations 1 --pruning_rate 0.2 --num_rounds 2 --llm_name qwen-plus --optimized_spatial --optimized_temporal