Anyone good with TensorFlow, please help

Hello, I’m in need of help from someone who knows TensorFlow well. I’ve been running into some pretty random issues and I don’t know how to fix them. Here is the code:

#Libraries: Pandas, TensorFlow Keras, SKlearn model split, Numpy
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np

#Get file
dataset = pd.read_csv('cancer.csv')

#X and Y values for the columns of data
x = dataset.drop(columns=["diagnosis(1=m, 0=b)"]) #Everything except the diagnosis variable (independent)
y = dataset["diagnosis(1=m, 0=b)"] #The diagnosis column the dependent variable

#Perform the split data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=2)

#Make the model variable (the actual model)
model = tf.keras.models.Sequential()

#Make the TensorFlow layers (tensorflow.org)
model.add(tf.keras.layers.Dense(256, input_shape = x_train.shape, activation="sigmoid"))
model.add(tf.keras.layers.Dense(256, activation="sigmoid"))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

#Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Get the shape of the training data
y_train_shape = np.shape(y_train)
x_train_shape = np.shape(x_train)

#KEEP THE RUNTIME GOING HERE (SPACE)

# Print the shape of the training data
print('X shape =', x_train_shape)
print('Y shape =', y_train_shape)

#fit the model
model.fit(x_train, y_train, epochs=1000)
model.save('cancerTumor.model')

#Evaluate the model
model.evaluate(x_test, y_test)

Here is the error:

X shape = (567, 30)
Y shape = (567,)
Epoch 1/1000
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-0828f4e05374> in <cell line: 39>()
     37 
     38 #fit the model
---> 39 model.fit(x_train, y_train, epochs=1000)
     40 model.save('cancerTumor.model')
     41 

1 frames
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py in tf__train_function(iterator)
     13                 try:
     14                     do_return = True
---> 15                     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16                 except:
     17                     do_return = False

ValueError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1377, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1360, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1349, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1126, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/input_spec.py", line 298, in assert_input_compatibility
        raise ValueError(

    ValueError: Input 0 of layer "sequential_10" is incompatible with the layer: expected shape=(None, 567, 30), found shape=(None, 30)

Here is the tutorial I used; you can download the file from the link:

Thanks,
Zm476

Hi - This is not really a Python question, but a TF/Keras question.
There is a separate forum for TF: https://discuss.tensorflow.org/

I watched the linked video and as a very first introduction I think it’s actually not too bad. I don’t understand why the same problem doesn’t appear in the video – I wonder if this might be because of yet another backwards incompatible change in TF or Keras (the video is already several years old).

In your code snippet you are not following the video, however: train_test_split should use test_size=0.2, not 2 (2 leaves you with a test set of only two rows, which is way too small).
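For reference, a quick sketch of the difference (assuming the same 569-row cancer.csv); scikit-learn treats an integer test_size as an absolute number of rows and a float as a fraction:

from sklearn.model_selection import train_test_split

# test_size=2 -> exactly 2 rows end up in the test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=2)
print(len(x_test))  # 2

# test_size=0.2 -> 20% of the rows end up in the test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
print(len(x_test))  # 114 for 569 rows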

– PS –
Something definitely must have changed in TF or Keras, but I’m not a TF user, so I don’t know what.
To make the code work (with the current TF), you need to change the definition of the first two model layers

model.add(tf.keras.layers.Dense(256, input_shape = x_train.shape, activation="sigmoid"))
model.add(tf.keras.layers.Dense(256, activation="sigmoid"))  # keeping this also works but doesn't really make sense

to just one layer:

model.add(tf.keras.layers.Dense(256, activation="sigmoid"))

You can look at the model architecture by calling model.summary() (after you’ve trained it).
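Since that layer no longer declares an input shape, the model only gets built once it sees data; if you want to inspect it before training, a small sketch (assuming the 30 feature columns from your data):

model.build(input_shape=(None, 30))  # build the model explicitly for 30 features
model.summary()                      # now prints the layer shapes and parameter counts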

Hello @hansgeunsmeyer,

I am very aware of the other forum, but I can’t use it; otherwise I would post this there, not here.
The model.summary() call got me this, and I changed the test_size, but it still gave me the same error and output:

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_3 (Dense)             (None, 455, 256)          7936      
                                                                 
 dense_4 (Dense)             (None, 455, 256)          65792     
                                                                 
 dense_5 (Dense)             (None, 455, 1)            257       
                                                                 
=================================================================
Total params: 73985 (289.00 KB)
Trainable params: 73985 (289.00 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
X shape = (455, 30)
Y shape = (455,)
Epoch 1/1000
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-22c835f0ff92> in <cell line: 39>()
     37 
     38 #fit the model
---> 39 model.fit(x_train, y_train, epochs=1000)
     40 model.save('cancerTumor.model')
     41 

1 frames
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py in tf__train_function(iterator)
     13                 try:
     14                     do_return = True
---> 15                     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16                 except:
     17                     do_return = False

ValueError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1377, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1360, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1349, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1126, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/input_spec.py", line 298, in assert_input_compatibility
        raise ValueError(

    ValueError: Input 0 of layer "sequential_1" is incompatible with the layer: expected shape=(None, 455, 30), found shape=(None, 30)

Did you actually make the modifications I suggested?

print(tf.__version__)   # you should have at least version 2.12.0

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)  
# 0.2, not 2

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(256, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

model.fit(x_train, y_train, epochs=200)  # 1000 is way too many

Alternatively you could also define the model as

features = (x_train.shape[1],)  # == (30,)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(256, input_shape=features, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Btw, one of the things that is pretty bad in that tutorial video is setting the number of training epochs to 1000. This is an extremely small training and test set, and even a simple logistic regression almost immediately gets an accuracy of about 90%. If you set it to 100 or 200, that’s high enough, and even then the model might already be overfitting.

With these settings the summary looks like:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 256)               7936      
                                                                 
 dense_1 (Dense)             (None, 1)                 257       
                                                                 
=================================================================
Total params: 8,193
Trainable params: 8,193
Non-trainable params: 0

And it reaches a score on the test set of about 0.93.
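For comparison, the logistic-regression baseline mentioned earlier could look something like this (a minimal sketch with scikit-learn, reusing the same x_train/x_test split):

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Scale the features, then fit a plain logistic regression as a baseline
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(x_train, y_train)
print("baseline accuracy:", baseline.score(x_test, y_test))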

Btw, for study and background material you will learn more if you work through the tutorials on Kaggle or Hugging Face, or the ones on the Google Colab or PyTorch websites. Tutorials on YouTube from random coders may not have that much value and don’t really teach you much (or anything) about what is going on here. This video is kind of OK as a very first 5-minute introduction to Keras, but I’d still say it’s subtly misleading in various ways (one of them being the sensational title).

I’m currently using Kaggle. Can you turn your suggestions into the full code, please?

Also, it generally just shouldn’t fail to read a shape correctly.

Agreed on the YouTube video.

Here is the full code:

import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf

assert tf.__version__ == "2.12.0"

df = pd.read_csv("cancer.csv")
x = df.drop(columns=["diagnosis(1=m, 0=b)"])
y = df["diagnosis(1=m, 0=b)"]
assert x.shape == (569, 30)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(256, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', 
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=200)

model.summary()
model.evaluate(x_test, y_test)
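
If you also want the model.save step from your original script, a minimal sketch (keeping the SavedModel directory name from your code):

# Save the trained model and reload it later
model.save("cancerTumor.model")
reloaded = tf.keras.models.load_model("cancerTumor.model")
reloaded.evaluate(x_test, y_test)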

Thank you,
I’ll try it.

Hello,

It works, thank you so much.

Best Regards,
Zm476

I want this to be really accurate, so I’m going really high on the epochs, but thanks for the recommendation.

At some point the model will not learn anymore if you keep increasing the epochs, and it may start getting worse results on the evaluation test set. You could try to determine what that point is and exactly how you identify it (it’s not completely trivial, since the training loss and score may become jittery).
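One common way to find that point automatically is Keras’ EarlyStopping callback together with a validation split; a minimal sketch (the patience value is just an example):

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",          # stop when the validation loss stops improving
    patience=10,                 # tolerate 10 epochs without improvement
    restore_best_weights=True,   # roll back to the best weights seen so far
)
model.fit(
    x_train, y_train,
    validation_split=0.2,        # hold out 20% of the training data for validation
    epochs=1000,                 # upper bound; training usually stops much earlier
    callbacks=[early_stop],
)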

It may also make sense, with such a small test set, to see how small you can make the model (a small number of units in the hidden layer, a small number of parameters) and how few epochs you can use while still getting more or less the same score on the test set. The question of what it actually means to have a score of, say, 95% accuracy on a test set like this, or whether that is really a significant difference from, say, 90%, is also not at all trivial.
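For the model-size experiment, a rough sketch that just loops over a few hidden-layer sizes (the sizes and the epoch count are arbitrary examples):

for units in (4, 16, 64, 256):
    small = tf.keras.models.Sequential([
        tf.keras.layers.Dense(units, activation="sigmoid"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    small.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    small.fit(x_train, y_train, epochs=100, verbose=0)
    loss, acc = small.evaluate(x_test, y_test, verbose=0)
    print(f"{units:3d} hidden units -> test accuracy {acc:.3f}")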


Great, thanks! :smile:

Hello @hansgeunsmeyer,

I’ve run some tests, and here is how accuracy and loss change with the number of epochs for 120 KB of data:

[Screenshot: Accuracy and Loss over epochs]
