
Python Tried many times to fix the string prediction problem but...

I am running this code below:
Python:
# Import modules
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

"""
This module is a string prediction model using LSTM.
It takes a file of strings composed of digits from 0 to 9 and splits them into input and target sequences.
The input sequence is the first five characters and the target sequence is the last five characters shifted by one position.
The model learns to predict the next character in the sequence given the previous five characters.
"""
# Define some constants
VOCAB_SIZE = 10 # number of possible tokens (digits from 0 to 9)
EMBED_SIZE = 32 # size of the embedding vectors
RNN_UNITS = 32 # size of the LSTM output vectors
BATCH_SIZE = 20 # number of sequences to process in each batch

# Import pandas
import pandas as pd

# Define the file path and name
file_path = "C:\\Users\\PC-1\\Desktop\\stringpred.txt"

# Read the file into a DataFrame using pandas.read_csv function
df = pd.read_csv(file_path, header=None)

# Convert the strings in the DataFrame to numeric values by removing the spaces and using pd.to_numeric function
df = df.apply(lambda x: pd.to_numeric(x.str.replace(" ", "")))

# Convert the DataFrame to a numpy array using df.values attribute
arrays = df.values

# Define a function to split the arrays into input and target sequences
def split_sequences(arrays):
    """
    This function splits each array into an input sequence and a target sequence.
    The input sequence is the first five characters and the target sequence is the last five characters shifted by one position.
    """
    # Initialize empty lists to store the input and target sequences
    input_sequences = []
    target_sequences = []

    # Loop over each array in the list
    for a in arrays:
        # Slice the array into input and target sequences
        input_sequence = a[:-1]
        target_sequence = a[1:]

        # Append the sequences to the corresponding lists
        input_sequences.append(input_sequence)
        target_sequences.append(target_sequence)

    return input_sequences, target_sequences

# Split the arrays into input and target sequences using split_sequences function
input_sequences, target_sequences = split_sequences(arrays)

# Split the data into training and testing sets with a ratio of 0.8:0.2 using train_test_split function from sklearn module
X_train, X_test, y_train, y_test = train_test_split(input_sequences, target_sequences, test_size=0.2, random_state=42)

# Reshape the input and target sequences into two-dimensional arrays using np.reshape function from numpy module
X_train = np.reshape(X_train, (-1, 5))
y_train = np.reshape(y_train, (-1, 5))
X_test = np.reshape(X_test, (-1, 5))
y_test = np.reshape(y_test, (-1, 5))


# Add some padding cells to the X_test and y_test arrays until they are divisible by 5 using np.pad function from numpy module
X_test = np.pad(X_test, (0, 5 - len(X_test) % 5), mode="constant")
y_test = np.pad(y_test, (0, 5 - len(y_test) % 5), mode="constant")

# Convert the input and target arrays to numpy arrays of float32 data type using np.asarray function from numpy module
X_train = np.asarray(X_train, dtype=np.float32)
y_train = np.asarray(y_train, dtype=np.float32)
X_test = np.asarray(X_test, dtype=np.float32)
y_test = np.asarray(y_test, dtype=np.float32)

# Define a function to generate a new string given a seed string
def generate_string(seed, model, subarrays):
    """
    This function generates a new string given a seed string using the trained model.
    It predicts the probabilities for the next token using the model and samples from them or takes the most likely token.
    It updates the seed array with the new token and repeats this process for six positions in the sequence.
    It returns the generated string as a concatenation of the tokens.
    """
    # Convert the seed string to an array of tokens
    seed_array = np.array([int(c) for c in seed])

    # Initialize an empty list to store the generated tokens
    output_array = []

    # Loop for six positions in the sequence
    for i in range(6):
        # Predict the probabilities for the next token using the model
        # Loop over the subarrays and concatenate the results
        probs = np.concatenate([model.predict(sub) for sub in subarrays], axis=0)

        # Sample from the probabilities or take the most likely token
        # Here we use sampling for more diversity, but you can change it as you like
        next_token = np.random.choice(VOCAB_SIZE, p=probs[0, -1])

        # Append the token to the output list
        output_array.append(next_token)


        # Update the seed array with the new token
        seed_array = np.append(seed_array[1:], next_token)

    # Convert the output list to a string and return it
    output_string = "".join(map(str, output_array))
    return output_string

# Read the file into a DataFrame using pandas.read_csv function
df = pd.read_csv(file_path, header=None)

# Convert the strings in the DataFrame to numeric values by removing the spaces and using pd.to_numeric function
df = df.apply(lambda x: pd.to_numeric(x.str.replace(" ", "")))

# Convert the DataFrame to a numpy array using df.values attribute
arrays = df.values

# Define a function to split the arrays into input and target sequences
def split_sequences(arrays):
    """
    This function splits each array into an input sequence and a target sequence.
    The input sequence is the first five characters and the target sequence is the last five characters shifted by one position.
    """
    # Initialize empty lists to store the input and target sequences
    input_sequences = []
    target_sequences = []

    # Loop over each array in the list
    for a in arrays:
        # Slice the array into input and target sequences
        input_sequence = a[:-1]
        target_sequence = a[1:]

        # Append the sequences to the corresponding lists
        input_sequences.append(input_sequence)
        target_sequences.append(target_sequence)

    return input_sequences, target_sequences

# Define a function to generate a new string given a seed string
def generate_string(seed, model, subarrays):
    """
    This function generates a new string given a seed string using the trained model.
    It predicts the probabilities for the next token using the model and samples from them or takes the most likely token.
    It updates the seed array with the new token and repeats this process for six positions in the sequence.
    It returns the generated string as a concatenation of the tokens.
    """
    # Convert the seed string to an array of tokens
    seed_array = np.array([int(c) for c in seed])

    # Initialize an empty list to store the generated tokens
    output_array = []

    # Loop for six positions in the sequence
    for i in range(6):
        # Predict the probabilities for the next token using the model
        # Loop over the subarrays and concatenate the results
        probs = np.concatenate([model.predict(sub) for sub in subarrays], axis=0)

        # Sample from the probabilities or take the most likely token
        # Here we use sampling for more diversity, but you can change it as you like
        next_token = np.random.choice(VOCAB_SIZE, p=probs[0, -1])

        # Append the token to the output list
        output_array.append(next_token)

        # Update the seed array with the new token
        seed_array = np.append(seed_array[1:], next_token)

    # Convert the output list to a string and return it
    output_string = "".join(map(str, output_array))

    return output_string

# Read and convert the strings from the file using read_strings function
# arrays = read_strings(file_path)

# Split the arrays into input and target sequences using split_sequences function
input_sequences, target_sequences = split_sequences(arrays)


# Split the data into training and testing sets with a ratio of 0.8:0.2 using train_test_split function from sklearn module
X_train, X_test, y_train, y_test = train_test_split(input_sequences, target_sequences, test_size=0.2, random_state=42)

# Reshape the input and target sequences into two-dimensional arrays using np.reshape function from numpy module
X_train = np.reshape(X_train, (-1, 5))
y_train = np.reshape(y_train, (-1, 5))
X_test = np.reshape(X_test, (-1, 5))
y_test = np.reshape(y_test, (-1, 5))

# Add some padding cells to the X_test array until it is divisible by 5 using np.pad function from numpy module
X_test = np.pad(X_test, (0, 5 - len(X_test) % 5), mode="constant")
y_test = np.pad(y_test, (0, 5 - len(y_test) % 5), mode="constant")

# Convert the input and target arrays to numpy arrays of float32 data type using np.asarray and astype functions from numpy module
X_train = np.asarray(X_train, dtype=np.float32)
y_train = np.asarray(y_train, dtype=np.float32)
X_test = np.asarray(X_test, dtype=np.float32)
y_test = np.asarray(y_test, dtype=np.float32)

# Split the X_test array into subarrays of size 5 using np.array_split function from numpy module
subarrays = np.array_split(X_test, len(X_test) / 5)

# Define the model architecture using keras.Sequential class from tensorflow module
model = keras.Sequential([
    # Embedding layer that maps tokens to vectors using layers.Embedding class from tensorflow module
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_SIZE),
    # LSTM layer that processes the embedded vectors using layers.LSTM class from tensorflow module
    layers.LSTM(units=RNN_UNITS, return_sequences=True),
    # Dense layer that outputs probabilities over tokens using layers.Dense class from tensorflow module
    layers.Dense(units=VOCAB_SIZE, activation="softmax")
])

# Compile the model with loss and optimizer using model.compile method from tensorflow module
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Train the model for some epochs using model.fit method from tensorflow module
model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=10)

# Test the generate_string function with some seed strings
print(generate_string("55420", model, subarrays))
print(generate_string("13120", model, subarrays))
print(generate_string("25050", model, subarrays))

Initially, I got this error message three times in a row when I ran the code:
Code:
Traceback (most recent call last):
  File "C:/Users/PC-1/Desktop/String Predict ver03-A-1.py", line 182, in <module>
    arrays = read_strings(file_path)
NameError: name 'read_strings' is not defined

That refers to this line here:
Python:
arrays = read_strings(file_path)
...so I turned that line into a comment so it wouldn't break execution, then ran the code again.

Now it is giving me this error message:
Code:
Epoch 1/10
Traceback (most recent call last):
  File "C:/Users/PC-1/Desktop/String Predict ver03-A-1.py", line 223, in <module>
    model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=10)
  File "C:\Users\PC-1\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\PC-1\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\engine\training.py", line 1754, in fit
    raise ValueError(
ValueError: Unexpected result of `train_function` (Empty logs). This could be due to issues in input pipeline that resulted in an empty dataset. Otherwise, please use `Model.compile(..., run_eagerly=True)`, or `tf.config.run_functions_eagerly(True)` for more information of where went wrong, or file a issue/bug to `tf.keras`.

I am at my wits' end here. Can anyone tell me what to fix?

If it would help clarify my problem, that code is meant to solve this particular programming problem:
Code:
Create Python source code that will predict the next unique string to appear based on a list of six-character strings of digits ranging from 0 to 5, stored in the Windows text file "stringpred.txt". As an example of what the list of strings looks like, refer to the section below:
...
5 5 4 2 0 5
5 4 1 4 5 5
4 4 4 2 2 0
1 3 1 2 0 1
1 2 4 4 5 5
3 2 1 4 5 5
5 1 5 2 5 4
0 1 5 5 5 4
3 3 1 5 3 5
5 3 3 4 3 5
0 5 3 3 0 2
3 3 0 3 5 1
5 2 2 5 4 0
3 4 3 5 2 3
4 5 2 3 4 5
3 0 4 4 5 5
2 1 2 4 5 5
4 3 0 0 1 5
4 3 2 2 2 4
2 5 0 5 0 3
3 5 1 3 4 4
...

Format output as:
   "The next predicted string will be:"

As an example:
3 0 4 4 5 5
2 1 2 4 5 5
4 3 0 0 1 5
4 3 2 2 2 4
2 5 0 5 0 3

The next predicted string will be: 3 5 1 3 4 4

If this is really hard to solve, which other forum site can I go to for help with this roadblock?
 
Hi there,
So, right off the bat, the error message points to a very specific line in your code:
Python:
model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=10)
Chances are very high that you are getting this error because you commented out that other line of code that was erroring out ;)

Speaking of that other line of code:
Python:
arrays = read_strings(file_path)
The error message there states that the function read_strings is not defined, meaning: that function definition does not exist ;) HINT HINT WINK WINK...
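One way to test that guess about `model.fit` is to print the shapes of the arrays that feed it. The sketch below is not the poster's script: it repeats only the loading and slicing steps from the question on three sample rows, and the in-memory `sample` stands in for stringpred.txt, assuming the real file is space-separated like the listing in the question. Note that `arrays` is still assigned by the `df.values` line in the posted code, so the shapes are the thing to look at.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for stringpred.txt (assumption: space-separated digits, no commas).
sample = io.StringIO("5 5 4 2 0 5\n5 4 1 4 5 5\n0 1 5 5 5 4\n")

# Same loading steps as in the posted script.
df = pd.read_csv(sample, header=None)
df = df.apply(lambda x: pd.to_numeric(x.str.replace(" ", "")))
arrays = df.values

# With no commas in the file, read_csv sees one column per line,
# so each row collapses into a single number.
print(arrays.shape)

# Same slicing as split_sequences: a one-element row has nothing left after a[:-1].
inputs = [a[:-1] for a in arrays]
X = np.reshape(inputs, (-1, 5))
print(X.shape)
```

If the second shape has zero rows, `model.fit` is being handed an empty training set, which would be consistent with the "Empty logs" message. In that case the loading step is the place to start, not the model.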
 
So you're saying these two lines of code:
This>>>
Python:
model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=10)

...and this>>>
Python:
arrays = read_strings(file_path)

...are connected - and that commenting out this line:
Python:
arrays = read_strings(file_path)

...is a bad idea?

By the way, isn't it true that in Python there's no need to define "read_strings"?
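On that last question, a short check can settle it. The sketch below only shows how Python treats a name that has not been defined or imported; the body given to `read_strings` here is a placeholder for illustration, not a real implementation.

```python
import builtins

# read_strings is not among Python's built-in names.
print(hasattr(builtins, "read_strings"))

try:
    read_strings("stringpred.txt")
    outcome = "called"
except NameError as err:
    # Calling a name that was never defined or imported raises NameError.
    outcome = str(err)

print(outcome)


def read_strings(path):
    """Placeholder body: only shows that a definition must exist before the call."""
    return []


# Once the function is defined, the same call goes through.
print(read_strings("stringpred.txt"))
```

So a function has to be defined in the script or imported from a module before the line that calls it runs.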
 
I don't believe read_strings is a built-in function of Python... I will double check, but yes. Based on the error messages, since read_strings is not defined anywhere in your code or in the code you import, the variable "arrays" is not being set by that line, and, if I am reading your code correctly, it is needed for other variables and eventually for X_train and y_train. So my best recommendation is to start there: either write your own implementation of that function, or see if there is a built-in function that will read the strings in a batch. You may need to change your code a bit depending on which implementation you choose. Below is a reference for file methods.
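As an illustration of the "write your own implementation" route, here is a minimal sketch using plain file methods. The name `read_strings` comes from the thread, but its body is hypothetical. It assumes one string per line, digits separated by spaces, and the same number of digits on every line. The temporary file only makes the example self-contained.

```python
import os
import tempfile

import numpy as np


def read_strings(file_path):
    """Hypothetical helper: read whitespace-separated digits, one string per line."""
    rows = []
    with open(file_path, "r", encoding="utf-8") as handle:
        for line in handle:
            tokens = line.split()
            # Skip blank lines and placeholder lines such as "...".
            if not tokens or not all(t.isdigit() for t in tokens):
                continue
            rows.append([int(t) for t in tokens])
    # Assumes every kept line has the same number of digits.
    return np.array(rows, dtype=np.int64)


# Usage on a throwaway copy of two sample rows from the question.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stringpred.txt")
    with open(path, "w", encoding="utf-8") as out:
        out.write("5 5 4 2 0 5\n...\n0 1 5 5 5 4\n")
    arrays = read_strings(path)

print(arrays.shape)
print(arrays[0][:-1], arrays[0][1:])
```

With one digit per column, each row keeps its leading zeros, and slices like `a[:-1]` and `a[1:]` each hold five digits, which is the shape the reshape to `(-1, 5)` in the posted script expects.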

 