How to solve the rasa train error?
1. Purpose
In this post, I will show you how to solve the following error when using rasa to train a chatbot:
[root@local rasa-test-0608]# ./train.sh
/opt/venv/lib/python3.10/site-packages/rasa/core/tracker_store.py:1048: MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings. Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
Base: DeclarativeMeta = declarative_base()
/opt/venv/lib/python3.10/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
/opt/venv/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/opt/venv/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/opt/venv/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('ruamel')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/opt/venv/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('ruamel.yaml')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/opt/venv/lib/python3.10/site-packages/tensorflow/python/framework/dtypes.py:246: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`. (Deprecated NumPy 1.24)
np.bool8: (False, True),
The configuration for policies was chosen automatically. It was written into the config file at 'config.yml'.
/opt/venv/lib/python3.10/site-packages/jieba/__init__.py:44: DeprecationWarning: invalid escape sequence '\.'
re_han_default = re.compile("([\u4E00-\u9FD5a-zA-Z0-9+#&\._%\-]+)", re.U)
/opt/venv/lib/python3.10/site-packages/jieba/__init__.py:46: DeprecationWarning: invalid escape sequence '\s'
re_skip_default = re.compile("(\r\n|\s)", re.U)
2023-06-14 06:00:53 INFO rasa.engine.training.hooks - Restored component 'JiebaTokenizer' from cache.
Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.658 seconds.
Prefix dict has been built successfully.
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/opt/venv/lib/python3.10/site-packages/urllib3/connection.py", line 419, in connect
self.sock = ssl_wrap_socket(
File "/opt/venv/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "/opt/venv/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "/usr/lib/python3.10/ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "/usr/lib/python3.10/ssl.py", line 1071, in _create
self.do_handshake()
File "/usr/lib/python3.10/ssl.py", line 1342, in do_handshake
self._sslobj.do_handshake()
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/opt/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/opt/venv/lib/python3.10/site-packages/urllib3/packages/six.py", line 769, in reraise
raise value.with_traceback(tb)
File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/opt/venv/lib/python3.10/site-packages/urllib3/connection.py", line 419, in connect
self.sock = ssl_wrap_socket(
File "/opt/venv/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "/opt/venv/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "/usr/lib/python3.10/ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "/usr/lib/python3.10/ssl.py", line 1071, in _create
self.do_handshake()
File "/usr/lib/python3.10/ssl.py", line 1342, in do_handshake
self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/rasa/engine/graph.py", line 394, in _load_component
self._component: GraphComponent = constructor( # type: ignore[no-redef]
File "/opt/venv/lib/python3.10/site-packages/rasa/engine/graph.py", line 221, in load
return cls.create(config, model_storage, resource, execution_context)
File "/opt/venv/lib/python3.10/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py", line 100, in create
return cls(config, execution_context)
File "/opt/venv/lib/python3.10/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py", line 67, in __init__
self._load_model_instance()
File "/opt/venv/lib/python3.10/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py", line 152, in _load_model_instance
self.tokenizer = model_tokenizer_dict[self.model_name].from_pretrained(
File "/opt/venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1763, in from_pretrained
resolved_vocab_files[file_id] = cached_file(
File "/opt/venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 409, in cached_file
resolved_file = hf_hub_download(
File "/opt/venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/opt/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1326, in hf_hub_download
http_get(
File "/opt/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 505, in http_get
r = _request_wrapper(
File "/opt/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 442, in _request_wrapper
return http_backoff(
File "/opt/venv/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 129, in http_backoff
response = requests.request(method=method, url=url, **kwargs)
File "/opt/venv/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/opt/venv/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/opt/venv/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/opt/venv/lib/python3.10/site-packages/requests/adapters.py", line 547, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/venv/bin/rasa", line 8, in <module>
sys.exit(main())
File "/opt/venv/lib/python3.10/site-packages/rasa/__main__.py", line 127, in main
cmdline_arguments.func(cmdline_arguments)
File "/opt/venv/lib/python3.10/site-packages/rasa/cli/train.py", line 56, in <lambda>
train_parser.set_defaults(func=lambda args: run_training(args, can_exit=True))
File "/opt/venv/lib/python3.10/site-packages/rasa/cli/train.py", line 87, in run_training
training_result = train_all(
File "/opt/venv/lib/python3.10/site-packages/rasa/api.py", line 105, in train
return train(
File "/opt/venv/lib/python3.10/site-packages/rasa/model_training.py", line 207, in train
return _train_graph(
File "/opt/venv/lib/python3.10/site-packages/rasa/model_training.py", line 286, in _train_graph
trainer.train(
File "/opt/venv/lib/python3.10/site-packages/rasa/engine/training/graph_trainer.py", line 105, in train
graph_runner.run(inputs={PLACEHOLDER_IMPORTER: importer})
File "/opt/venv/lib/python3.10/site-packages/rasa/engine/runner/dask.py", line 101, in run
dask_result = dask.get(run_graph, run_targets)
File "/opt/venv/lib/python3.10/site-packages/dask/local.py", line 557, in get_sync
return get_async(
File "/opt/venv/lib/python3.10/site-packages/dask/local.py", line 500, in get_async
for key, res_info, failed in queue_get(queue).result():
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/opt/venv/lib/python3.10/site-packages/dask/local.py", line 542, in submit
fut.set_result(fn(*args, **kwargs))
File "/opt/venv/lib/python3.10/site-packages/dask/local.py", line 238, in batch_execute_tasks
return [execute_task(*a) for a in it]
File "/opt/venv/lib/python3.10/site-packages/dask/local.py", line 238, in <listcomp>
return [execute_task(*a) for a in it]
File "/opt/venv/lib/python3.10/site-packages/dask/local.py", line 229, in execute_task
result = pack_exception(e, dumps)
File "/opt/venv/lib/python3.10/site-packages/dask/local.py", line 224, in execute_task
result = _execute_task(task, data)
File "/opt/venv/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/opt/venv/lib/python3.10/site-packages/rasa/engine/graph.py", line 474, in __call__
self._load_component(**constructor_kwargs)
File "/opt/venv/lib/python3.10/site-packages/rasa/engine/graph.py", line 407, in _load_component
raise GraphComponentException(
rasa.engine.exceptions.GraphComponentException: Error initializing graph component for node run_LanguageModelFeaturizer1.
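The traceback above ends in LanguageModelFeaturizer, but the root cause is the first exception: the TLS handshake was reset while the tokenizer files were being downloaded through huggingface_hub. A quick way to confirm that the server itself cannot reach the download host is sketched below. This is only a diagnostic sketch, not part of Rasa; the function name `tls_handshake_ok` is made up for this post, and the host to test (presumably huggingface.co, although the traceback does not print the URL) is an assumption.

```python
import socket
import ssl


def tls_handshake_ok(host: str, port: int = 443, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection and a TLS handshake with host:port succeed."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # wrap_socket performs the handshake, which is the step that
            # raised ConnectionResetError in the traceback.
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError as exc:
        # ConnectionResetError, timeouts and ssl.SSLError are all OSError subclasses.
        print(f"handshake with {host}:{port} failed: {exc!r}")
        return False
```

Calling `tls_handshake_ok("huggingface.co")` on the server and on the laptop should show whether only the server is affected. If it returns False on the server, retrying the training is unlikely to help and the model has to be provided another way.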
The startup command:
docker run --user 0 --network host -it -v $(pwd):/app rasa/rasa:3.5.10-full train
The config.yml of the Rasa bot:
# https://rasa.com/docs/rasa/model-configuration/
recipe: default.v1
# The assistant project unique identifier
# This default value must be replaced with a unique assistant name within your deployment
assistant_id: test_assistant
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: zh
pipeline:
  - name: JiebaTokenizer
    dictionary_path: "pipline/jieba_userdict"
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "bert-base-chinese"
2. Solution
Copy the language model from your local laptop to the Hugging Face cache directory on the server.
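One possible way to do the copy is sketched here, assuming the model was already downloaded once on the laptop into the default cache (`~/.cache/huggingface/hub`). The helper names `pack_model_cache` and `unpack_model_cache` and the archive file name are hypothetical; the transfer itself (scp, rsync, a USB stick) is left out. The important detail is that the archive keeps the symlinks from `snapshots/` into `blobs/` intact.

```python
import tarfile
from pathlib import Path


def pack_model_cache(hub_dir: Path, folder: str, archive: Path) -> Path:
    """Write hub_dir/folder into a gzipped tar archive, keeping symlinks as symlinks."""
    with tarfile.open(archive, "w:gz") as tar:
        # tarfile stores symlinks as links by default (dereference=False),
        # so the entries under snapshots/ keep pointing into blobs/.
        tar.add(hub_dir / folder, arcname=folder)
    return archive


def unpack_model_cache(archive: Path, hub_dir: Path) -> None:
    """Extract the archive into hub_dir, creating the directory if needed."""
    hub_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data" is the safe one.
            tar.extractall(hub_dir, filter="data")
        else:
            tar.extractall(hub_dir)
```

On the laptop this would be called as `pack_model_cache(Path.home() / ".cache/huggingface/hub", "models--bert-base-chinese", Path("bert-base-chinese.tar.gz"))`, and on the server as `unpack_model_cache(Path("bert-base-chinese.tar.gz"), Path.home() / ".cache/huggingface/hub")`.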
After copying, ~/.cache/huggingface/hub on the server contains the following:
[root@local hub]# tree models--bert-base-chinese
models--bert-base-chinese
├── blobs
│   ├── 612acd33db45677c3d6ba70615336619dc65cddf1ecf9d39a22dd1934af4aff2
│   ├── a521dc2845bdddbe822864290c6b928396fc5ee8
│   ├── ca4f9781030019ab9b253c6dcb8c7878b6dc87a5
│   └── e3c6d456fb2616f01a9a6cd01a1be1a36353ed22
├── refs
│   └── main
└── snapshots
    └── 8d2a91f91cc38c96bb8b4556ba70c392f8d5ee55
        ├── config.json -> ../../blobs/a521dc2845bdddbe822864290c6b928396fc5ee8
        ├── tf_model.h5 -> ../../blobs/612acd33db45677c3d6ba70615336619dc65cddf1ecf9d39a22dd1934af4aff2
        ├── tokenizer_config.json -> ../../blobs/e3c6d456fb2616f01a9a6cd01a1be1a36353ed22
        └── vocab.txt -> ../../blobs/ca4f9781030019ab9b253c6dcb8c7878b6dc87a5
4 directories, 9 files
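The layout in the tree output can also be checked programmatically before starting a long training run. The following is a minimal sketch using only the standard library. The function names are invented for this post, the list of expected files is taken from the tree output above, and it relies on my understanding that `refs/main` holds the commit hash naming the snapshot folder.

```python
from pathlib import Path

# Files listed in the snapshot directory of the tree output above.
EXPECTED_FILES = ("config.json", "tf_model.h5", "tokenizer_config.json", "vocab.txt")


def cache_folder_name(model_weights: str) -> str:
    """Map a model id such as 'bert-base-chinese' to its hub cache folder name."""
    return "models--" + model_weights.replace("/", "--")


def missing_cache_files(hub_dir: Path, model_weights: str, revision: str = "main") -> list[str]:
    """Return the expected files that are absent (or dangling links) in the cached snapshot."""
    model_dir = hub_dir / cache_folder_name(model_weights)
    ref_file = model_dir / "refs" / revision
    if not ref_file.is_file():
        # Without the ref there is no snapshot to resolve, so everything is missing.
        return list(EXPECTED_FILES)
    commit = ref_file.read_text().strip()
    snapshot = model_dir / "snapshots" / commit
    # Path.exists() follows symlinks, so a link to a missing blob counts as missing.
    return [name for name in EXPECTED_FILES if not (snapshot / name).exists()]
```

An empty list from `missing_cache_files(Path.home() / ".cache/huggingface/hub", "bert-base-chinese")` means the copy is complete. Any names it returns are files that would still have to be downloaded during training.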
Then run the training again:
Downloading (…)lve/main/config.json: 100%|████████████████████████████████████████████████████████████████████| 624/624 [00:00<00:00, 433kB/s]
Downloading tf_model.h5: 100%|█████████████████████████████████████████████████████████████████████████████| 478M/478M [02:39<00:00, 2.99MB/s]
Some layers from the model checkpoint at bert-base-chinese were not used when initializing TFBertModel: ['mlm___cls', 'nsp___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at bert-base-chinese.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
2023-06-14 06:11:16 INFO rasa.engine.training.hooks - Starting to train component 'RegexFeaturizer'.
2023-06-14 06:11:16 INFO rasa.engine.training.hooks - Finished training component 'RegexFeaturizer'.
2023-06-14 06:11:16 INFO rasa.engine.training.hooks - Starting to train component 'DIETClassifier'.
Epochs: 0%| | 0/300 [00:00<?, ?it/s]
Epochs: 100%|████████████████████████████████████████████████████████████████| 300/300 [01:38<00:00, 3.05it/s, t_loss=0.376, i_acc=1, e_f1=1]
2023-06-14 06:12:54 INFO rasa.engine.training.hooks - Finished training component 'DIETClassifier'.
2023-06-14 06:12:54 INFO rasa.engine.training.hooks - Starting to train component 'EntitySynonymMapper'.
2023-06-14 06:12:54 INFO rasa.engine.training.hooks - Finished training component 'EntitySynonymMapper'.
2023-06-14 06:12:54 INFO rasa.engine.training.hooks - Starting to train component 'ResponseSelector'.
/opt/venv/lib/python3.10/site-packages/rasa/utils/train_utils.py:528: UserWarning: constrain_similarities is set to `False`. It is recommended to set it to `True` when using cross-entropy loss.
rasa.shared.utils.io.raise_warning(
2023-06-14 06:12:55 INFO rasa.nlu.selectors.response_selector - Retrieval intent parameter was left to its default value. This response selector will be trained on training examples combining all retrieval intents.
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Finished training component 'ResponseSelector'.
Processed rules: 100%|██████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 1462.52it/s, # trackers=1]
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Starting to train component 'MemoizationPolicy'.
Processed trackers: 0it [00:00, ?it/s]
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Finished training component 'MemoizationPolicy'.
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Starting to train component 'RulePolicy'.
Processed trackers: 100%|█████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 959.73it/s, # action=19]
Processed actions: 19it [00:00, 10969.27it/s, # examples=17]
Processed trackers: 0it [00:00, ?it/s]
Processed trackers: 100%|██████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 358.25it/s]
Processed trackers: 100%|█████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 1905.88it/s]
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Finished training component 'RulePolicy'.
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Starting to train component 'TEDPolicy'.
Processed trackers: 0it [00:00, ?it/s]
/opt/venv/lib/python3.10/site-packages/rasa/core/policies/ted_policy.py:723: UserWarning: Skipping training of `TEDPolicy` as no data was provided. You can exclude this policy in the configuration file to avoid this warning.
rasa.shared.utils.io.raise_warning(
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Finished training component 'TEDPolicy'.
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Starting to train component 'UnexpecTEDIntentPolicy'.
2023-06-14 06:12:55 WARNING rasa.shared.utils.common - The UnexpecTED Intent Policy is currently experimental and might change or be removed in the future 🔬 Please share your feedback on it in the forum (https://forum.rasa.com) to help us make this feature ready for production.
Processed trackers: 0it [00:00, ?it/s]
/opt/venv/lib/python3.10/site-packages/rasa/core/policies/ted_policy.py:723: UserWarning: Skipping training of `UnexpecTEDIntentPolicy` as no data was provided. You can exclude this policy in the configuration file to avoid this warning.
rasa.shared.utils.io.raise_warning(
2023-06-14 06:12:55 INFO rasa.engine.training.hooks - Finished training component 'UnexpecTEDIntentPolicy'.
Your Rasa model is trained and saved at 'models/20230614-060720-threaded-quadtree.tar.gz'.
You can see that the model is now available on the server and that the training finishes successfully.
3. Summary
In this post, I demonstrated how to solve the rasa train error. The key is to make the right pretrained model available in the Hugging Face cache on your machine. That’s it, thanks for reading.