2021-10-10 09:36:16.310688: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-10 09:36:22.654277: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-10 09:36:24.666270: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 6214 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1
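Line 1 imports DeepChem and line 2 imports TensorFlow. After a blank line, the model is defined from line 4 onward. It uses Keras's Sequential class and consists of a fully connected layer with ReLU activation, 50% dropout for regularization, and a final layer that produces a scalar output. The last line wraps the model in a DeepChem Model object and specifies the L2 loss as the loss function used for training.
With the model defined, we can now train and evaluate it. Loading the datasets, training, and evaluation are written the same way as in the previous tutorials. The code below loads the Delaney solubility dataset, then trains and predicts using fingerprint (ECFP) features.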
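To make the choice of loss function concrete, here is a minimal sketch of the quantity an L2 loss penalizes: the squared difference between each prediction and its label. The helper names `l2_loss` and `mean_l2_loss` are hypothetical and written in plain Python for illustration; this is not DeepChem's implementation, and how DeepChem weights or averages the per-sample values is not shown here.

```python
def l2_loss(predictions, labels):
    """Per-sample squared error (illustrative helper, not a DeepChem API)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return [(p - y) ** 2 for p, y in zip(predictions, labels)]


def mean_l2_loss(predictions, labels):
    """Average of the per-sample squared errors (assumed reduction)."""
    losses = l2_loss(predictions, labels)
    return sum(losses) / len(losses)


if __name__ == "__main__":
    preds = [0.5, -1.0, 2.0]
    labels = [0.0, -1.0, 4.0]
    print(l2_loss(preds, labels))  # [0.25, 0.0, 4.0]
    print(mean_l2_loss(preds, labels))
```

Because the error is squared, a prediction that is off by 2.0 contributes sixteen times as much as one that is off by 0.5, so large misses dominate training.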
In [2]:
tasks, datasets, transformers = dc.molnet.load_delaney(featurizer='ECFP', splitter='random')
train_dataset, valid_dataset, test_dataset = datasets
model.fit(train_dataset, nb_epoch=50)
metric = dc.metrics.Metric(dc.metrics.pearson_r2_score)
print('training set score:', model.evaluate(train_dataset, [metric]))
print('test set score:', model.evaluate(test_dataset, [metric]))
2021-10-10 09:36:27.135686: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
training set score: {'pearson_r2_score': 0.9805195363948028}
test set score: {'pearson_r2_score': 0.7232708731622692}
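The scores above are reported as `pearson_r2_score`. As the name suggests, this is read here as the square of the Pearson correlation coefficient between true and predicted values. The sketch below computes that quantity in plain Python. The function name `pearson_r2` is hypothetical, and the assumption that DeepChem's metric follows exactly this definition should be checked against its documentation.

```python
import math


def pearson_r2(y_true, y_pred):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel in r).
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    r = cov / math.sqrt(var_t * var_p)
    return r * r


if __name__ == "__main__":
    # A constant offset does not lower the score: only the linear
    # relationship between the two sequences matters.
    print(pearson_r2([1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]))
```

Under this definition the metric rewards linear agreement rather than small absolute error, so it is best read alongside an error-based metric when prediction offsets matter.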
import torch

pytorch_model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1000),
    torch.nn.ReLU(),
    torch.nn.Dropout(0.5),
    torch.nn.Linear(1000, 1)
)
model = dc.models.TorchModel(pytorch_model, dc.models.losses.L2Loss())

model.fit(train_dataset, nb_epoch=50)
print('training set score:', model.evaluate(train_dataset, [metric]))
print('test set score:', model.evaluate(test_dataset, [metric]))
training set score: {'pearson_r2_score': 0.9804940146138569}
test set score: {'pearson_r2_score': 0.7109119459328586}
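Let's train this model on the BACE dataset, which can be loaded with dc.molnet.load_bace_classification. It is a binary classification task: predicting whether or not a molecule inhibits BACE-1 (β-site Amyloid precursor protein Cleaving Enzyme 1, one of the enzymes that cleave the amyloid-β precursor protein, APP). The code below loads the dataset, trains the model, and evaluates it.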
In [5]:
# Load BACE with ECFP features and a scaffold split.
tasks, datasets, transformers = dc.molnet.load_bace_classification(featurizer='ECFP', splitter='scaffold')
train_dataset, valid_dataset, test_dataset = datasets
model.fit(train_dataset, nb_epoch=100)
metric = dc.metrics.Metric(dc.metrics.roc_auc_score)
print('training set score:', model.evaluate(train_dataset, [metric]))
print('test set score:', model.evaluate(test_dataset, [metric]))
training set score: {'roc_auc_score': 0.999631207655235}
test set score: {'roc_auc_score': 0.7830163043478261}
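The classification scores above use ROC AUC. One common way to read this metric is as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way in plain Python. The function name `roc_auc` is hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed; it is not how DeepChem or scikit-learn implement the metric.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Three of the four positive/negative pairs are ranked correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for comparing the training score with the test score.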